devfs deadlock

Alex Unleashed unledev at gmail.com
Fri Aug 18 19:28:16 UTC 2006


Hello,

Before anything else I'd like to say I'm working on a SoC project for
Gentoo for which I have to port a sandbox-like application wrapper (in
userspace) to FreeBSD, which deals with the building and installing of
software so that nothing gets screwed in the filesystem. It's
finished, but unfortunately it gets frozen at random points,
preventing the system from launching new programs or saving to disk,
which suggested a kernel bug, as was later confirmed looking at the
waitchannels and debugging. I know there is someone working out issues
in devfs code in 6.x, so this might also be interesting to him.

I've been able to reproduce both in 6.1-RELEASE-p3 and 6-STABLE
(snapshot from August 16th ~01:00 GMT) a deadlock in devfs code which
leaves the system unable to access the disk. I've come up with some
interesting debugging info, and it looks to me like there are vnode
problems while a sx lock is being held.

My take at it is that the deadlock occurs when a process gets a lock
on a vnode (tagged "devfs") and another process xlocks an sx lock
("devfsmount"). For some reason the one holding the sx lock wants to
get the lock on the vnode through devfs_allocv(), and the other
process wants to get the sx lock through devfs_lookup(). From this
point on, pretty much anything wanting to touch the filesystem waits
forever on devfs_root() for another vnode flagged as VV_ROOT and
locked by the process holding the sx lock.

Patching the devfs code with fixes from -CURRENT didn't work out.

Here's some info from a DDB session after the deadlock had just occurred:

db> ps
  pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
17072 c4145624   0  17070  1071 0004002 [SLPQ piperd
0xc39904c8][SLP] gawk-3.1.5
17071 c4149c48   0  17070  1071 0000002 [SLPQ devfs      0xc41435c8][SLP] sh
17070 c4145830   0  17068  1071 0000002 [SLPQ wait       0xc4145830][SLP] sh
17069 c414920c   0  17067  1071 0004002 [SLPQ devfsmount 0xc396d540][SLP] xgcc
<<snip>>

db> trace 17069
Tracing pid 17069 tid 100071 td 0xc4146600
sched_switch(c4146600,0,1,c09545dc,c4146600) at sched_switch+0x2ea
mi_switch(1,0,c0955f54,1ba,c091b594) at mi_switch+0x425
sleepq_switch(c396d540,0,c0955f54,20c,dad529f0) at sleepq_switch+0x110
sleepq_wait(c396d540,c0a36da0,c0918064,1,c09c1b6c) at sleepq_wait+0x65
cv_wait(c396d540,c0a36da0,c094f9cd,23f,dad52ad4) at cv_wait+0x1e6
_sx_xlock(c396d510,c094f9c4,23f,c09b8d80,dad52ad4) at _sx_xlock+0x99
devfs_lookup(dad52ad4,c092ddcb,37,c096da3c,c4143570) at devfs_lookup+0x36
VOP_LOOKUP_APV(c09b8d80,dad52ad4,c4146600,c4146600,0) at VOP_LOOKUP_APV+0xa4
lookup(dad52b80,0,c09204ab,b6,c070487d) at lookup+0x4ed
namei(dad52b80,c091c646,275,c09550c4,c4146600) at namei+0x43c
kern_lstat(c4146600,807a800,0,dad52c54,c091b47f) at kern_lstat+0x4f
lstat(c4146600,dad52d04,3c4,c096ba20,c4146600) at lstat+0x2f
syscall(3b,3b,3b,28232cb2,bfbfa2c1) at syscall+0x31f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x282251bb, esp =
0xbfbf9e7c, ebp = 0xbfbfab38 ---

db> trace 17071
Tracing pid 17071 tid 100076 td 0xc394cd80
sched_switch(c394cd80,0,1,c09545dc,c394cd80) at sched_switch+0x2ea
mi_switch(1,0,c0955f54,1ba,c091b594) at mi_switch+0x425
sleepq_switch(c41435c8,0,c0955f54,20c,d9bb6768) at sleepq_switch+0x110
sleepq_wait(c41435c8,0,c091bebc,c8,c394cd80) at sleepq_wait+0x65
msleep(c41435c8,c0a3743c,50,c0917eb4,0) at msleep+0x342
acquire(d9bb6840,80,60000,b1,0) at acquire+0x11c
lockmgr(c41435c8,2002,c4143638,c394cd80,d9bb6888) at lockmgr+0x576
vop_stdlock(d9bb68b0,c08f40c3,42af,c091d728,1) at vop_stdlock+0x32
VOP_LOCK_APV(c09b8d80,d9bb68b0,c4143638,8,c0920936) at VOP_LOCK_APV+0xa4
vn_lock(c4143570,2002,c394cd80,7a7,2) at vn_lock+0xd2
vget(c4143570,2,c394cd80,c3972005,c396c460) at vget+0xf0
devfs_allocv(c396c400,c3889800,d9bb6b94,c394cd80,d9bb6988) at devfs_allocv+0x43
devfs_lookupx(d9bb6ad4,c094f9c4,23f,c09b8d80,d9bb6ad4) at devfs_lookupx+0x5f5
devfs_lookup(d9bb6ad4,c092ddcb,37,c096dac3,c3971000) at devfs_lookupx+0x3e
VOP_LOOKUP_APV(c09b8d80,d9bb6ad4,c394cd80,c394cd80,0) at VOP_LOOKUP_APV+0xa4
lookup(d9bb6b80,0,c09204ab,b6,c0a82d70) at lookup+0x4ed
namei(d9bb6b80,c091c646,275,c09550c4,c394cd80) at namei+0x43c
kern_lstat(c394cd80,806d800,0,d9bb6c54,c091b47f) at kern_lstat+0x4f
lstat(c394cd80,d9bb6d04,3c4,c096ba20,c394cd80) at lstat+0x2f
syscall(3b,3b,3b,281b2cb2,bfbf9731) at syscall+0x31f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x281a51bb, esp =
0xbfbf92ec, ebp = 0xbfbf9fa8 ---

db> show alllocks
Process 17071 (sh) thread 0xc394cd80 (100076)
exclusive sx devfsmount r = 0 (0xc396d510) locked @ fs/devfs/devfs_vnops:575
Process 34 (irq1: atkbd0) thread 0xc3687a80 (100031)
exclusive sleep mutex Giant r = 0 (0xc0a37ac0) locked @ kern/kern_intr.c:683

db> show lock 0xc396d510
 class: sx
 name: devfsmount
 state: XLOCK: 0xc394cd80 (tid 100076, pid 17071, "sh")
 waiters: 0 shared, 1 exclusive

db> show lockedvnods
Locked vnodes

0xc3971000: tag devfs, type VDIR
    usecount 3, writecount 0, refcount 3 mountedhere 0
    flags (VV_ROOT)
     lock type devfs: EXCL (count 1) by thread 0xc394cd80 (pid 17071)
#0 0xc06903a1 at lockmgr+0x621
#1 0xc06f98db at vop_stdlock+0x32
#2 0xc08e18fb at VOP_LOCK_APV+0xa4
#3 0xc0712460 at vn_lock+0xd2
#4 0xc07040ff at vget+0xf0
#5 0xc0645702 at devfs_allocv+0x43
#6 0xc064550e at devfs_root+0x2c
#7 0xc06fc023 at lookup+0x829
#8 0xc06fb521 at namei+0x43c
#9 0xc070ba21 at kern_lstat+0x4f
#10 0xc070b9ad at lstat+0x2f
#11 0xc08cbb89 at syscall+0x31f
#12 0xc08b890f at Xint0x80_syscall+0x1f

0xc4143570: tag devfs, type VDIR
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags ()
     lock type devfs: EXCL (count 1) by thread 0xc4146600 (pid 17069)
with 1 pending
#0 0xc06903a1 at lockmgr+0x621
#1 0xc06f98db at vop_stdlock+0x32
#2 0xc08e18fb at VOP_LOCK_APV+0xa4
#3 0xc0712460 at vn_lock+0xd2
#4 0xc07040ff at vget+0xf0
#5 0xc0645702 at devfs_allocv+0x43
#6 0xc064680e at devfs_lookupx+0x5f5
#7 0xc064685b at devfs_lookup+0x3e
#8 0xc08dea25 at VOP_LOOKUP_APV+0xa4
#9 0xc06fbce7 at lookup+0x4ed
#10 0xc06fb521 at namei+0x43c
#11 0xc070ba21 at kern_lstat+0x4f
#12 0xc070b9ad at lstat+0x2f
#13 0xc08cbb89 at syscall+0x31f
#14 0xc08b890f at Xint0x80_syscall+0x1f

After this the system still works, but it's unable to perform
operations on the filesystem, like launching a new process. When this
happens, the new process calls at some point a syscall (like open or
chflags, for example) which ends up at devfs_root() and the vnode with
the VV_ROOT flag gets its refcount and pending status incremented,
having the process waiting there forever. And that's what there is to
it.

Unfortunately I am not experienced enough with the kernel to propose a
patch, but I can reproduce this and perform whatever you'd want me to
in order to fix it.

Thanks,

Alex


More information about the freebsd-hackers mailing list