[patch] zfs livelock and thread priorities
Ben Kelly
ben at wanderview.com
Fri Apr 17 01:30:24 UTC 2009
On Apr 15, 2009, at 12:35 AM, Artem Belevich wrote:
> I'll give it a try in a few days. I'll let you know how it went.
Just FYI, I was able to reproduce some of the failures with the
original patch using an SMP vmware image. The new patch seems to fix
these problems and I was able to successfully mount a zfs pool.
> BTW, now that you're tinkering with ZFS threads and priorities, would
> you by any chance have any idea why zfs scrub is so painfully slow on
> -current?
> When I start scrub on my -stable box, it pretty much runs full speed
> -- I can see disks under load all the time.
> However on -current scrub seems to run in small bursts. Disks get busy
> for a second or so and then things get quiet for about five seconds or
> so and this pattern repeats over and over.
I don't know. I haven't had to scrub my devices very often. I ran a
couple here locally and did not see the behavior you describe. There
is a significant delay between typing zpool scrub and when it actually
begins disk I/O, but after that it completes without pause. If I get
a chance I'll try to look at what the scrub code is doing.
Thanks again.
- Ben
> --Artem
>
>
>
> On Tue, Apr 14, 2009 at 7:32 PM, Ben Kelly <ben at wanderview.com> wrote:
>> On Apr 14, 2009, at 11:50 AM, Ben Kelly wrote:
>>>
>>> On Apr 13, 2009, at 7:36 PM, Artem Belevich wrote:
>>>>
>>>> Tried your patch that used PRIBIO+{1,2} for priorities with
>>>> -current r191008 and the kernel died with a "spinlock held too
>>>> long" panic. Actually, there apparently were two instances of the
>>>> panic on different cores.
>>>>
>>>> Here's output of "alltrace" and "ps" after the crash:
>>>> http://pastebin.com/f140f4596
>>>>
>>>> I've reverted the change and kernel booted just fine.
>>>>
>>>> The box is quad-core with two ZFS pools -- one single-disk and
>>>> another one is a two-disk mirror. FreeBSD is installed on UFS
>>>> partitions, ZFS is used for user stuff only.
>>>
>>> Thanks for the report!
>>>
>>> I don't have a lot of time to look at this today, but it appears
>>> that there is a race condition on SMP machines when setting the
>>> priority immediately after the kproc is spawned. As a quick hack I
>>> tried adding a pause between the kproc_create() and the
>>> sched_prio(). Can you try this patch?
>>>
>>>
>>> http://www.wanderview.com/svn/public/misc/zfs_livelock/zfs_thread_priority.diff
>>>
>>> I'll try to take a closer look at this later in the week.
>>
>> Sorry for replying to my own e-mail, but I've updated the patch
>> again with a less hackish approach. (At the same URL above.) I
>> added a new kproc_create_priority() function to set the priority of
>> the new thread before it's first scheduled. This should avoid any
>> SMP races with setting the priority from an external thread.
>>
>> If you would be willing to try the test again with this new patch I
>> would appreciate it.
>>
>> Thanks!
>>
>> - Ben
>>
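
For readers following the quoted discussion, here is a rough userland sketch of the idea behind kproc_create_priority(): fix the priority before the new thread first runs, instead of adjusting it from another thread after creation. This is only a POSIX threads analogy and not the kernel patch itself. The names spawn_with_priority() and worker() are hypothetical, and the sketch assumes the platform lets an unprivileged process create a SCHED_OTHER thread with an explicit priority.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical worker: records the priority it observes when it starts
 * running. Because the priority was placed in the creation attributes,
 * there is no window in which the thread runs with a different value.
 */
static void *
worker(void *arg)
{
	int *seen = arg;
	int policy;
	struct sched_param sp;

	if (pthread_getschedparam(pthread_self(), &policy, &sp) == 0)
		*seen = sp.sched_priority;
	return (NULL);
}

/*
 * Hypothetical helper: create a thread whose priority is set through
 * its attributes, so no external thread has to change it afterwards.
 * Returns 0 on success or an errno-style value on failure.
 */
int
spawn_with_priority(int prio, int *seen)
{
	pthread_attr_t attr;
	struct sched_param sp = { .sched_priority = prio };
	pthread_t tid;
	int error;

	error = pthread_attr_init(&attr);
	if (error != 0)
		return (error);

	/* Use the attributes below rather than inheriting the creator's. */
	error = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
	if (error == 0)
		error = pthread_attr_setschedpolicy(&attr, SCHED_OTHER);
	if (error == 0)
		error = pthread_attr_setschedparam(&attr, &sp);
	if (error == 0)
		error = pthread_create(&tid, &attr, worker, seen);

	pthread_attr_destroy(&attr);
	if (error != 0)
		return (error);

	/* Wait for the worker so *seen is safe to read by the caller. */
	return (pthread_join(tid, NULL));
}
```

A caller would pass a priority that is valid for the policy, for example the value returned by sched_get_priority_min(SCHED_OTHER), and can then check that the worker saw that same value.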