stress test deadlock involving vm and/or geom
Don Lewis
truckman at FreeBSD.org
Wed Sep 28 19:14:58 PDT 2005
After running Peter Holm's stress test suite for several hours while
creating and removing snapshots, I got a system deadlock that doesn't
appear to be snapshot-related: it happened while no snapshots were
present, and appears to have been caused by a VM memory shortage.
Kernel memory usage didn't look excessive in a post-mortem "vmstat -m"
check.
Large numbers of processes were waiting on "vmwait", "pfault",
and "biord". The processes waiting on "vmwait" were waiting for the
pagedaemon to free up some memory.
The pagedaemon itself was waiting on "wswbuf0", which indicates it
was waiting for a pbuf; in other words, all of its pbufs were
already in use.
I'm not sure where the pagedaemon's I/O requests were stuck, so I
started looking at geom. The g_up and g_event threads appeared to be
idle, but g_down was waiting on "g_down".
The stack trace for g_down is:
(kgdb) where
#0 0xc0653933 in sched_switch (td=0xc2275480, newtd=0xc2274780, flags=1)
at /usr/src/sys/kern/sched_4bsd.c:973
#1 0xc0649178 in mi_switch (flags=1, newtd=0x0)
at /usr/src/sys/kern/kern_synch.c:356
#2 0xc066075c in sleepq_switch (wchan=0x0)
at /usr/src/sys/kern/subr_sleepqueue.c:427
#3 0xc06609bc in sleepq_timedwait (wchan=0xe35e0cf4)
at /usr/src/sys/kern/subr_sleepqueue.c:568
#4 0xc0648dc2 in msleep (ident=0xe35e0cf4, mtx=0x0, priority=76,
wmesg=0xc08695b3 "g_down", timo=100) at /usr/src/sys/kern/kern_synch.c:223
#5 0xc0606958 in g_io_schedule_down (tp=0xc2275480)
at /usr/src/sys/geom/geom_io.c:418
#6 0xc0606eb6 in g_down_procbody () at /usr/src/sys/geom/geom_kern.c:118
#7 0xc06303bc in fork_exit (callout=0xc0606e5c <g_down_procbody>, arg=0x0,
frame=0xe35e0d38) at /usr/src/sys/kern/kern_fork.c:786
#8 0xc07f5f9c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208
There appear to be 559 items stuck in the g_down queue:
(kgdb) frame 5
#5 0xc0606958 in g_io_schedule_down (tp=0xc2275480)
at /usr/src/sys/geom/geom_io.c:418
418 msleep(&error, NULL, PRIBIO, "g_down", hz/10);
(kgdb) print g_bio_run_down
$1 = {bio_queue = {tqh_first = 0xc38da318, tqh_last = 0xc38da248},
bio_queue_lock = {mtx_object = {lo_class = 0xc08d5fe4,
lo_name = 0xc0853a4c "bio queue", lo_type = 0xc0853a4c "bio queue",
lo_flags = 196608, lo_list = {tqe_next = 0xc0936788,
tqe_prev = 0xc0936810}, lo_witness = 0xc0946a38}, mtx_lock = 4,
mtx_recurse = 0}, bio_queue_length = 559}
g_down is sitting at msleep() in this block of code in
g_io_schedule_down():
(kgdb) list
413 }
414 CTR0(KTR_GEOM, "g_down has work to do");
415 g_bioq_unlock(&g_bio_run_down);
416 if (pace > 0) {
417 CTR1(KTR_GEOM, "g_down pacing self (pace %d)", pace);
418 msleep(&error, NULL, PRIBIO, "g_down", hz/10);
419 pace--;
420 }
421 error = g_io_check(bp);
This is consistent with pace > 0:
(kgdb) print pace
$2 = 1
What is curious is that pace is only incremented when g_io_deliver() is
called with error=ENOMEM, and within this function that can only happen
in this code:
422 if (error) {
423 CTR3(KTR_GEOM, "g_down g_io_check on bp %p provider "
424 "%s returned %d", bp, bp->bio_to->name, error);
425 g_io_deliver(bp, error);
426 continue;
427 }
428 CTR2(KTR_GEOM, "g_down processing bp %p provider %s", bp,
429 bp->bio_to->name);
and error isn't changed anywhere else in this code, so at the msleep()
it should still hold its value from the previous pass through the loop,
*but* it is 0 ...
(kgdb) print error
$6 = 0
Also, g_io_check() can only return ENOMEM if that is the provider's
error status, but the provider seems to be happy:
(kgdb) print *(bp->bio_to)
$5 = {name = 0xc258c8d8 "da0", provider = {le_next = 0x0,
le_prev = 0xc258c794}, geom = 0xc258c780, consumers = {
lh_first = 0xc24671c0}, acr = 3, acw = 3, ace = 5, error = 0, orphan = {
tqe_next = 0x0, tqe_prev = 0x0}, mediasize = 36703933440,
sectorsize = 512, stripesize = 0, stripeoffset = 0, stat = 0xc25c35a0,
nstart = 6007328, nend = 6006932, flags = 0, private = 0x0, index = 0}
More information about the freebsd-current mailing list