FreeBSD 9.0-RC1 freeze after swapoff/swapon procedure on md/vnode-backend file

Wed Oct 26 22:27:06 UTC 2011

On 26 October 2011 23:31, Subbsd <subbsd at gmail.com> wrote:
> Hi
>
> I get easy reproducible  a hang-up servers that use the file-based
> swap file after swapoff / swapon procedure (in this case, some of the
> data must be swapped). For example:
>
> 1) dd if=/dev/zero of=/usr/swp1 bs=1m count=100
> 2) mdconfig -a -t vnode -f /usr/swp1
> 3) swapon /dev/md0
> 4) begin to allocated memory, for example by simple:
> tail /dev/zero
>
> 5) after a filling of some percent, do swapoff /dev/md0, then swapon
> /dev/md0. you can try this procedure again.
>
> The system may stop responding to commands and freezes or locks up
> after some time. From the outside - the core lives (icmp response
> goes) but the disk system is not available.
>
> PS: one of my server to my mind is frozen without swapoff/on - just
> had three swapfile, a day after he crashed.

Something interesting while trying to reproduce your problem:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048970, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1057174, size: 4096
panic: swapoff: failed to locate 16056 swap blocks
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802e009a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff80486d87 = kdb_backtrace+0x37
panic() at 0xffffffff8044f6ee = panic+0x2ee
swapoff_one() at 0xffffffff80687425 = swapoff_one+0x475
sys_swapoff() at 0xffffffff8068789b = sys_swapoff+0x1bb
amd64_syscall() at 0xffffffff806c60c9 = amd64_syscall+0x299
Xfast_syscall() at 0xffffffff806b1467 = Xfast_syscall+0xf7
--- syscall (424, FreeBSD ELF64, sys_swapoff), rip = 0x800ab307c, rsp
= 0x7fffffffd9c8, rbp = 0 ---
KDB: enter: panic
[ thread pid 63255 tid 100211 ]
Stopped at      0xffffffff8048696b = kdb_enter+0x3b:    movq
$0,0x735f72(%rip)

Below is a trace for a process on another CPU that's doing intensive
malloc+bzero in userland:

db> bt 63066
Tracing pid 63066 tid 100199 td 0xfffffe000e89f000
cpustop_handler() at 0xffffffff806bb46b = cpustop_handler+0x2b
ipi_nmi_handler() at 0xffffffff806bb540 = ipi_nmi_handler+0x50
trap() at 0xffffffff806c7035 = trap+0x2a5
nmi_calltrap() at 0xffffffff806b15bf = nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff8043e0d0, rsp = 0xffffffff80dc4dc0, rbp
= 0xffffff80908de750 ---
_mtx_unlock_flags() at 0xffffffff8043e0d0 = _mtx_unlock_flags+0x170
swp_pager_meta_ctl() at 0xffffffff806841aa = swp_pager_meta_ctl+0xea
swap_pager_haspage() at 0xffffffff80684272 = swap_pager_haspage+0x42
vm_fault_hold() at 0xffffffff8068e379 = vm_fault_hold+0x599
trap_pfault() at 0xffffffff806c6c26 = trap_pfault+0xe6
trap() at 0xffffffff806c733f = trap+0x5af
calltrap() at 0xffffffff806b1183 = calltrap+0x8
--- trap 0xc, rip = 0x4006ed, rsp = 0x7fffffffdad0, rbp = 0x7fffffffdae0 ---

That corresponds to kgdb:

#9  0xffffffff8044f6e4 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:624
#10 0xffffffff80687425 in swapoff_one (sp=0xfffffe000dc3dc00,
    cred=0xffffff808e5bf340) at /usr/src/sys/vm/swap_pager.c:1774
#11 0xffffffff8068789b in sys_swapoff (td=0xfffffe000e90f000,
uap=Variable "uap" is not available.
)
    at /usr/src/sys/vm/swap_pager.c:2236
#12 0xffffffff806c60c9 in amd64_syscall (td=0xfffffe000e90f000, traced=0)
    at subr_syscall.c:131
---Type <return> to continue, or q <return> to quit---
#13 0xffffffff806b1467 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:387

and

(kgdb) thr 108
[Switching to thread 108 (Thread 100199)]#0  cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1394
1394            CPU_SET_ATOMIC(cpu, &stopped_cpus);
(kgdb) bt
#0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394
#1  0xffffffff806bb540 in ipi_nmi_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1376
#2  0xffffffff806c7035 in trap (frame=0xffffffff80dc4d10)
    at /usr/src/sys/amd64/amd64/trap.c:200
#3  0xffffffff806b15bf in nmi_calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:501
#4  0xffffffff8043e0d0 in _mtx_unlock_flags (m=0xffffffff80d8af80, opts=0,
    file=0xffffffff80791e48 "/usr/src/sys/vm/swap_pager.c", line=2040)
    at /usr/src/sys/kern/kern_mutex.c:221
[smth. wrong there -- no further stack: swap_pager_*,  etc]

Here both swap_pager_swapoff() and swp_pager_meta_ctl() contend on
swhash_mtx. Or rather that's due to low limit set on retries counter?

Let's see again for another crash:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1062783, size: 4096
panic: swapoff: failed to locate 22133 swap blocks
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802e009a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff80486d87 = kdb_backtrace+0x37
panic() at 0xffffffff8044f6ee = panic+0x2ee
swapoff_one() at 0xffffffff80687425 = swapoff_one+0x475
sys_swapoff() at 0xffffffff8068789b = sys_swapoff+0x1bb
amd64_syscall() at 0xffffffff806c60c9 = amd64_syscall+0x299
Xfast_syscall() at 0xffffffff806b1467 = Xfast_syscall+0xf7
--- syscall (424, FreeBSD ELF64, sys_swapoff), rip = 0x800ab307c, rsp
= 0x7fffffffd9c8, rbp = 0 ---
KDB: enter: panic
[ thread pid 2807 tid 100149 ]
Stopped at      0xffffffff8048696b = kdb_enter+0x3b:    movq
$0,0x735f72(%rip)

Hmm... now the remain all 3 cpus are on idle.

The hungry process is not on CPU now and just sleeping, nothing
interesting there:

Tracing command hungry pid 2707 tid 100077 td 0xfffffe00029a4900
sched_switch() at 0xffffffff804798f7 = sched_switch+0x167
mi_switch() at 0xffffffff80458faa = mi_switch+0x1ea
turnstile_wait() at 0xffffffff8049872d = turnstile_wait+0x27d
_mtx_lock_sleep() at 0xffffffff8043e500 = _mtx_lock_sleep+0x130
_mtx_lock_flags() at 0xffffffff8043e8ac = _mtx_lock_flags+0x16c
pmap_enter() at 0xffffffff806c3867 = pmap_enter+0x87
vm_fault_hold() at 0xffffffff8068f39c = vm_fault_hold+0x15bc
trap_pfault() at 0xffffffff806c6c26 = trap_pfault+0xe6
trap() at 0xffffffff806c733f = trap+0x5af
calltrap() at 0xffffffff806b1183 = calltrap+0x8
--- trap 0xc, rip = 0x4006ed, rsp = 0x7fffffffdad0, rbp = 0x7fffffffdae0 ---

However, a pagedaemon kernel process now stuck at the same place, although
it's at _mtx_lock_sleep unlike _mtx_unlock_flags from a previous crash.

Tracing command pagedaemon pid 4 tid 100059 td 0xfffffe00026b5900
sched_switch() at 0xffffffff804798f7 = sched_switch+0x167
mi_switch() at 0xffffffff80458faa = mi_switch+0x1ea
turnstile_wait() at 0xffffffff8049872d = turnstile_wait+0x27d
_mtx_lock_sleep() at 0xffffffff8043e500 = _mtx_lock_sleep+0x130
_mtx_lock_flags() at 0xffffffff8043e8ac = _mtx_lock_flags+0x16c
swp_pager_meta_ctl() at 0xffffffff8068413a = swp_pager_meta_ctl+0x7a
swap_pager_haspage() at 0xffffffff80684272 = swap_pager_haspage+0x42
vm_page_cache() at 0xffffffff806a133a = vm_page_cache+0x2ca
vm_pageout() at 0xffffffff806a40a1 = vm_pageout+0x1201
fork_exit() at 0xffffffff8041f965 = fork_exit+0x135
fork_trampoline() at 0xffffffff806b16ae = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff80003c5d00, rbp = 0 ---

-- 
wbr,
pluknet