[patch] zfs livelock and thread priorities
Adam McDougall
mcdouga9 at egr.msu.edu
Sat May 16 03:13:34 UTC 2009
On Tue, Apr 28, 2009 at 04:52:23PM -0400, Ben Kelly wrote:
On Apr 28, 2009, at 2:11 PM, Artem Belevich wrote:
> My system had eventually deadlocked overnight, though it took much
> longer than before to reach that point.
>
> In the end I've got many, many processes sleeping in zio_wait with no
> disk activity whatsoever.
> I'm not sure if that's the same issue or not.
>
> Here are stack traces for all processes -- http://pastebin.com/f364e1452
> I've got the core saved, so if you want me to dig out some more info,
> let me know if/how I could help.
It looks like there is a possible deadlock between zfs_zget() and
zfs_zinactive(). They both acquire a lock via ZFS_OBJ_HOLD_ENTER().
The zfs_zinactive() path can get called indirectly from within
zio_done(). The zfs_zget() can in turn block waiting for zio_done()'s
completion while holding the object lock.
The following patch might help:
http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff
This simply bails out of the inactive processing if the object lock is
already held. I'm not sure if this is 100% correct or not as it
cannot verify there are references to the vnode. I also tried
executing the zfs_zinactive() logic in a taskqueue to avoid the
deadlock, but that caused other deadlocks to occur.
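To illustrate the bail-out idea in isolation, here is a minimal userspace sketch. It is not the patch itself and not ZFS code: a pthread mutex stands in for the per-object lock, and obj_lock, inactive_runs and inactive_try() are made-up names used only for this example. The point is just that the inactive path tries the lock and skips its work if the lock is busy, instead of sleeping on it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for one per-object hold lock. */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the inactive work actually ran (illustration only). */
static int inactive_runs = 0;

/*
 * Hypothetical analogue of the inactive path with the bail-out:
 * try to take the object lock, and if it is already held (for
 * example by a zget-like caller that is waiting on I/O), return
 * immediately instead of blocking.
 *
 * Returns true if the work was done, false if it was skipped.
 */
static bool inactive_try(void)
{
    int rc;

    if (pthread_mutex_trylock(&obj_lock) != 0)
        return false;           /* lock busy: skip rather than sleep */

    inactive_runs++;            /* placeholder for the real teardown */

    rc = pthread_mutex_unlock(&obj_lock);
    assert(rc == 0);
    (void)rc;                   /* silence unused warning under NDEBUG */
    return true;
}
```

With the lock free, inactive_try() does its work. With the lock held, it returns false right away, which is the case where the caller would otherwise have blocked. The cost is the one noted above: the skipped work has to be safe to skip, and this sketch does not address that.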
Hope that helps.
- Ben
It's my understanding that the deadlock was fixed in -current.
How does that affect the usefulness of the thread priorities
patch? Should I continue testing it, or is it effectively a
NOOP now?
Also, I've been doing some fairly intense testing of zfs in
recent -current, and I am tracking down a situation where
performance gets worse, but I think I found a workaround.
I am gathering more data regarding the cause, workaround,
symptoms, and originating commit and will post about it soon.
More information about the freebsd-current
mailing list