[patch] zfs livelock and thread priorities
ben at wanderview.com
Tue May 19 11:14:26 UTC 2009
On May 19, 2009, at 5:40 AM, Attilio Rao wrote:
> 2009/5/19 Ben Kelly <ben at wanderview.com>:
>> On May 18, 2009, at 1:38 PM, Attilio Rao wrote:
>>> This still doesn't explain priorities like 49 or such seen in the
>>> first report as long as we don't set priorities by hand,
>> I'm trying to understand why this particular priority value is so
>> concerning, but I'm a little bit confused. Can you elaborate on why you
>> think it's a problem? From previous off-list e-mails I get the impression
>> that you are concerned that it does not fall on an RQ_PPQ boundary. Is
>> this the case? Again, I may be completely confused, but ULE does not seem
>> to consider RQ_PPQ when it assigns priorities for interactive threads.
>> Here is how I came to this conclusion:
> I'm concerned because the first starvation I saw in this thread was
> caused by the priority being lowered inappropriately (it was 49 versus
> 45, IIRC). 49 means that the thread will never be chosen while the 45s
> are still in the runqueue. I'm not concerned about RQ_PPQ boundaries.
Ah, ok. Sorry for my confusion.
I guess the condition seemed somewhat reasonable to me because the
behavior of the 45s probably looks very interactive to the scheduler.
The user threads wake up, see that there is no space in the arc,
signal the txg threads, then sleep. The txg threads then wake up, see
that the spa_zio threads are not done, signal all the user threads,
then sleep. They bounce back and forth like this very quickly while
waiting for data to be flushed to the disk. (On my system this can
take a while since my backup pool is on a set of encrypted external
USB drives.) It seems likely that their runtime and sleeptime values
are balanced, so the scheduler marks them as high-priority interactive
threads.
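For reference, ULE's interactivity heuristic can be sketched roughly like
this (modeled on sched_interact_score() in sys/kern/sched_ule.c; the
constants and the integer scaling are simplified assumptions, not the exact
kernel code):

```python
# Simplified sketch of ULE's interactivity score (assumed constants;
# modeled loosely on sched_interact_score() in sys/kern/sched_ule.c).
SCHED_INTERACT_MAX = 100
SCHED_INTERACT_HALF = SCHED_INTERACT_MAX // 2
SCHED_INTERACT_THRESH = 30  # threads scoring below this look "interactive"

def interact_score(runtime, slptime):
    """Return 0..100; lower means 'more interactive'."""
    if runtime > slptime:
        # Mostly-running threads score in the upper half (batch-like).
        div = max(1, runtime // SCHED_INTERACT_HALF)
        return SCHED_INTERACT_HALF + (SCHED_INTERACT_HALF - slptime // div)
    if slptime > runtime:
        # Mostly-sleeping threads score in the lower half (interactive).
        div = max(1, slptime // SCHED_INTERACT_HALF)
        return runtime // div
    return SCHED_INTERACT_HALF if runtime else 0

# A thread that wakes only briefly before sleeping again accumulates far
# more sleep time than run time and scores as highly interactive.
print(interact_score(10, 1000))   # -> 0   (interactive)
print(interact_score(1000, 10))   # -> 100 (batch)
```

The point is only that the score is a pure function of accumulated run and
sleep time, so a signal-and-sleep ping-pong is indistinguishable from a user
typing at a terminal.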
So to me the interprocess communication within zfs appears to be
somewhat brain damaged in low memory conditions, but I do not think it
points to a problem in the scheduler. It seems that no matter what
algorithm the scheduler uses to determine interactivity, an application
can devise a perverse workload that will be misclassified.
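As a toy illustration (this is not ZFS code; the thread names and the
Condition-based handshake are invented for the sketch), two threads that do
nothing but signal each other mimic the user-thread/txg-thread bounce
described above: each side wakes, does almost no work, wakes the peer, and
goes straight back to sleep, which is exactly the signature a
sleep-time-based heuristic reads as interactive:

```python
import threading

# Toy model of the signal-and-sleep ping-pong: a "user" side and a "txg"
# side alternate strictly, each spending nearly all its time blocked in
# cond.wait() (voluntary sleep) between trivial bursts of work.
cond = threading.Condition()
turn = "user"
rounds = []

def bounce(me, other, n):
    global turn
    for _ in range(n):
        with cond:
            while turn != me:
                cond.wait()       # sleep until the other side signals us
            rounds.append(me)     # trivial amount of actual work
            turn = other
            cond.notify_all()     # wake the peer, then loop back to waiting

t1 = threading.Thread(target=bounce, args=("user", "txg", 5))
t2 = threading.Thread(target=bounce, args=("txg", "user", 5))
t1.start(); t2.start()
t1.join(); t2.join()
print(rounds)  # strict alternation: user, txg, user, txg, ...
```

Neither thread makes forward progress on the real work (flushing to disk),
yet both present the run/sleep profile of an interactive task.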
Anyway, that was my rough guesstimate of what was happening. If you
have time to do a more thorough analysis of the KTR dump, that would be
great. Thanks again for your help!
More information about the freebsd-current mailing list