pthread_cond_timedwait() broken in 9-stable? (from JAN 10)

David Xu listlog2011 at gmail.com
Fri Feb 17 01:56:48 UTC 2012


On 2012/2/17 8:42, Julian Elischer wrote:
> Adding David Xu for his thoughts since he rewrote the code in question 
> in revision 213098
>
> On 2/16/12 2:57 PM, Julian Elischer wrote:
>> On 2/16/12 1:06 PM, Julian Elischer wrote:
>>> On 2/16/12 9:34 AM, Andriy Gapon wrote:
>>>> on 15/02/2012 23:41 Julian Elischer said the following:
>>>>> The program fio (an IO test in ports) uses pthreads
>>>>>
>>>>> the following code (from fio-2.0.3, but it's in earlier code too)
>>>>> has suddenly started misbehaving.
>>>>>
>>>>>          clock_gettime(CLOCK_REALTIME, &t);
>>>>>          t.tv_sec += seconds + 10;
>>>>>
>>>>>          pthread_mutex_lock(&mutex->lock);
>>>>>
>>>>>          while (!mutex->value && !ret) {
>>>>>                  mutex->waiters++;
>>>>>                  ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
>>>>>                  mutex->waiters--;
>>>>>          }
>>>>>
>>>>>          if (!ret) {
>>>>>                  mutex->value--;
>>>>>                  pthread_mutex_unlock(&mutex->lock);
>>>>>          }
>>>>>
>>>>>
>>>>> It turns out that 'ret' sometimes comes back instantly (on my machine)
>>>>> with a value of 60 (ETIMEDOUT), despite the fact that we set the
>>>>> timeout 10 seconds into the future.
>>>>>
>>>>> Has anyone else seen anything like this?
>>>>> (and yes, the condition variable attribute has been set to use the 
>>>>> REALTIME clock).
>>>> But why?
>>>>
>>>> Just a hypothesis that maybe there is some issue with time keeping 
>>>> on that system.
>>>> How would that code work out for you with MONOTONIC?
>>>
>>> Jens Axboe (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC, 
>>> and they both had the same problem,
>>> i.e. random early returns with ETIMEDOUT.
>>>
>>> I think we will try moving our machine forward to a newer -stable to 
>>> see if that resolves it.
>> Kan upgraded the machine today to today's 9.x branch tip and the 
>> problem still occurs.
>> 8.x does not have this problem.
>>
>> I do not have a 9-RELEASE machine to test on, so I cannot tell if 
>> this came in with the burst of changes
>> that went in after the 9.x branch was unfrozen after the release of 9.0.
>>
>>
>
I am trying to reproduce the problem. Do you have complete sample code 
to test?

Regards,
David Xu


