[patch] zfs livelock and thread priorities
Ben Kelly
ben at wanderview.com
Thu Apr 30 01:56:21 UTC 2009
On Apr 29, 2009, at 7:47 PM, Lawrence Stewart wrote:
> Ben Kelly wrote:
>> On Apr 29, 2009, at 7:58 AM, Ben Kelly wrote:
>>> On Apr 29, 2009, at 2:43 AM, Jaakko Heinonen wrote:
>>>> On 2009-04-28, Ben Kelly wrote:
>>>>>> http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff
>>>>>
>>>>> The patch is updated in the same location above.
>>>>
>>>> There's a fatal typo in the patch:
>>>>
>>>> - ZFS_OBJ_HOLD_ENTER(zfsvfs, z_id);
>>>> + locked == ZFS_OBJ_HOLD_TRYENTER(zfsvfs, z_id);
>>>>          ^^
>>>
>>> Yikes! Thanks for catching this!
>>>
>>> The patch has been updated at the same URL. If anyone has patched
>>> their system please grab the new version. Sorry for the confusion.
>>
>> Argh! The patch was still broken even after this fix.
>>
>> Apparently when I tested my taskqueue solution I forgot to do a
>> make installkernel. For some reason the taskqueue approach
>> deadlocks my server at home under normal conditions. Therefore I
>> have reverted the patch to use the simple return. I still don't
>> think this is the right solution, but I don't have time to
>> completely figure out what is going on right now.
>>
>> Again, sorry for the mess!
>
> As far as I can tell, one of the developers is working on a patch to
> address the same issue you're discussing in this thread. He ran into
> it on his SSD ZFS installation and the symptoms sound likely to be
> the same as what you're discussing. I believe he's testing a patch
> which is inspired by the one the OpenSolaris guys used to fix the
> bug, which you can look at here:
>
> http://people.freebsd.org/~pjd/patches/vn_rele_hang.patch
>
> The OpenSolaris one has major incompatibilities with FreeBSD, so it
> can't be applied directly.
>
> As soon as it's ready I think he'll be making it available for wider
> testing, so stay tuned.
>
> Cheers,
> Lawrence
>
> PS: Apologies if the issue you're working on is not the same as the
> one addressed by the OpenSolaris patch above.

Thank you! This does appear to be the same issue and I look forward
to seeing the final fix.

For now I've gone ahead and updated my patch with a naive adaptation
of the OpenSolaris diff. It seems more correct than what I had and I
was worried people would waste time testing my broken approach. I've
only been able to test it on my i386, non-SMP server, however.

Thanks again.

- Ben
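
For readers following the thread, the typo Jaakko pointed out and the
"simple return" fallback can be illustrated with a small user-space
sketch. This is not the patch itself. It assumes a pthread mutex as a
stand-in for the ZFS object hold lock, and it assumes the try-enter
macro evaluates to nonzero when the lock is acquired. The names
obj_lock, obj_hold_tryenter, obj_hold_exit, and inactive_sketch are
hypothetical and do not appear in the ZFS source.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-object ZFS hold lock. */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed semantics: nonzero if the lock was acquired, zero if busy. */
static int
obj_hold_tryenter(void)
{
	return (pthread_mutex_trylock(&obj_lock) == 0);
}

static void
obj_hold_exit(void)
{
	pthread_mutex_unlock(&obj_lock);
}

/*
 * Try to take the lock and do the work. If the lock is busy, back off
 * instead of blocking. Returns true if the work ran, false otherwise.
 */
static bool
inactive_sketch(int *work_done)
{
	int locked;

	/*
	 * This must be an assignment. Written as "locked == ...", the
	 * statement compares and throws the result away, so "locked" is
	 * never set even though the try-enter call still runs.
	 */
	locked = obj_hold_tryenter();
	if (!locked)
		return (false);	/* back off rather than block */

	(*work_done)++;		/* placeholder for the real work */
	obj_hold_exit();
	return (true);
}
```

Calling inactive_sketch() while another holder owns obj_lock returns
false without touching the counter. Whether backing off like this is
the right fix is exactly what the thread leaves open.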