Is anything being done re: the pcm timeout issue?
Rusty Nejdl
rnejdl at ringofsaturn.com
Tue Aug 10 20:29:41 PDT 2004
>>
>> And I have seen that these will eventually stop working one by one
>> until I have none left. lsof and fstat don't show any programs using
>> them, but nonetheless, programs like xmms and gaim can't use them
>> anymore.
Well, try as I might, I haven't been able to duplicate this tonight. I've got 4 vchans set up, and I ran madplay continuously on all 4 channels for 4 hours; it worked the whole time.
>
> The vchan code is fairly broken. I was hoping to have some time to
> work on this (and other problems in the top half of the sound code) before
> 5.3, but it looks like the clock has just about run out.
I'm not seeing the locked channels yet, but that doesn't mean that they aren't there.
>
>
>> Do you have any more details on the pcm play timeout? Are you using
>> vchans? What program are you using?
>
> My suspicion is that there is either a problem in ich_intr() that is
> causing it to stop receiving interrupts or to stop calling chn_intr(), or
> there is enough interrupt latency to allow the DMA pointer to wrap and
> fool chn_dmaupdate() into thinking no data was consumed. It is possible
> that the ich_intr() problem is specific to amd64.
>
> I previously sent out these suggestions on how to debug the problem:
I remember seeing these, but I'm learning as I go, so that is a bit more than I can do at present.
>
>
> ------ Forwarded message ------
> From: Don Lewis <truckman at freebsd.org>
> Subject: Re: Questionable code in sys/dev/sound/pcm/channel.c
> Date: Tue, 27 Jul 2004 15:15:06 -0700 (PDT)
> To: mat at cnd.mcgill.ca
> Cc: freebsd-current at freebsd.org
>
>
> On 27 Jul, Mathew Kanner wrote:
>
>> On Jul 26, John-Mark Gurney wrote:
>>
>>> Conrad J. Sabatier wrote this message on Mon, Jul 26, 2004 at 16:35
>>> -0500:
>>>
>>>> Why the formulaic calculation of timeout, if it's simply going to
>>>> be unconditionally set to 1 immediately afterwards anyway? What's
>>>> going on here?
>>>
>>> Well, if you look at the annotations, that absolute set of timeout
>>> was added in rev 1.65 by cg with the comment: tweaks to reduce
>>> latency/pauses in output
>>>
>>
>>
>> I think this has been raised on the mailing list before.
>> IIRC, the logic for this is to check frequently for dead channels but
>> CG is the authority.
>>
>
> My suspicion is that this change was made to reduce the consequences of
> lost wakeups from the interrupt routine. This would have been more of a
> problem when tsleep() was used in chn_sleep() and shouldn't be needed now
> that the top and bottom halves of the code use the channel lock and
> chn_sleep() uses msleep() to atomically release the lock and wait for the
> wakeup from the interrupt code. That said, setting timeout to 1 shouldn't
> hurt anything and will just waste a bit of CPU time.
>
>
>>>> Also, at the end of the function:
>>>>
>>>>
>>>> if (count <= 0) {
>>>>         c->flags |= CHN_F_DEAD;
>>>>         printf("%s: play interrupt timeout, channel dead\n",
>>>>             c->name);
>>>> }
>>>>
>>>> return ret;
>>>> }
>>>>
>>>
>>> That was changed in rev 1.52 (also by cg); previously it was just a
>>> check for count == 0.
>>>
>>> So, I'd recommend sending a message to cg to ask why he made these
>>> changes...
>
> The original version of the code always set timeout to 1 and looped on
> (count > 0), so count could never go negative. When the code was
> changed to set timeout to something larger than 1, count could go negative if
> (hz % timeout != 0), so the condition for setting CHN_F_DEAD had to
> be modified accordingly.
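
To make that arithmetic concrete, here is a minimal userland sketch of the accounting, not the kernel code itself. It assumes count starts at hz and is reduced by timeout each time a sleep times out, and that every sleep times out (the dead-channel case). The function name final_count() is made up for illustration.

```c
#include <assert.h>

/*
 * Userland model of the timeout accounting described above. This is
 * NOT chn_write(); final_count() is a hypothetical helper. Every sleep
 * is assumed to time out, so count drops by timeout on each pass until
 * it is no longer positive.
 */
static int
final_count(int hz, int timeout)
{
	int count;

	assert(hz > 0 && timeout > 0);
	count = hz;
	while (count > 0)
		count -= timeout;	/* one timed-out sleep */
	return (count);
}
```

With hz = 100, a timeout of 1 or 10 leaves count at exactly 0, but a timeout of 3 leaves it at -2. A test for (count == 0) would miss that last case, while (count <= 0) catches both.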
>
> My suspicion is that there is sometimes enough latency in executing the
> interrupt routine that the hardware DMA pointer is wrapping and
> chn_dmaupdate() is calculating delta as zero. This would cause
> chn_wrfeed() not to consume any data from the software buffer (and skip
> the wakeup()), which might be enough to cause the chn_write() to time out
> while waiting for space to become available in the software buffer. It
> would be interesting to enable the debug code in chn_dmaupdate(), and add
> (delta == 0) as a condition to trigger the device_printf().
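
The wrap scenario can be sketched in userland as follows. This is a model under stated assumptions, not chn_dmaupdate(): it assumes delta is the modular difference between two samples of the hardware pointer in a ring buffer, and the names ring_delta() and advance() are invented for the example.

```c
#include <assert.h>

/*
 * Bytes the hardware appears to have consumed between two samples of
 * its DMA pointer, in a ring buffer of bufsize bytes. Hypothetical
 * model only; the real calculation lives in the sound driver.
 */
static unsigned int
ring_delta(unsigned int old_ptr, unsigned int new_ptr, unsigned int bufsize)
{
	assert(bufsize > 0 && old_ptr < bufsize && new_ptr < bufsize);
	return ((new_ptr + bufsize - old_ptr) % bufsize);
}

/* Pointer position after the hardware has played `played` more bytes. */
static unsigned int
advance(unsigned int ptr, unsigned int played, unsigned int bufsize)
{
	assert(bufsize > 0 && ptr < bufsize);
	return ((ptr + played) % bufsize);
}
```

With a 4096-byte buffer and the pointer last seen at 1000, 512 bytes of playback gives a delta of 512. If the interrupt is serviced so late that exactly 4096 bytes have played, the pointer is back at 1000 and the delta is 0, so it looks as if nothing was consumed. Any full wrap is invisible in the same way: 4096 + 512 bytes played also reports only 512.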
>
>
> The bigger question is what is the cause of the latency ...
>
>
>
> ------ Forwarded message ------
> From: Don Lewis <truckman at freebsd.org>
> Subject: Re: Questionable code in sys/dev/sound/pcm/channel.c
> Date: Tue, 27 Jul 2004 15:21:57 -0700 (PDT)
> To: conrads at cox.net
> Cc: freebsd-current at freebsd.org
>
>
> On 27 Jul, Conrad J. Sabatier wrote:
>
>>
>> On 26-Jul-2004 Conrad J. Sabatier wrote:
>>
>>>
>>> On 26-Jul-2004 Conrad J. Sabatier wrote:
>>>
>>>> I'm a little perplexed at the following bit of logic in chn_write()
>>>> (which is where the "interrupt timeout, channel dead" messages are
>>>> being generated).
>>
>> [snip]
>>
>>
>>>> Also, at the end of the function:
>>>>
>>>>
>>>> if (count <= 0) {
>>>>         c->flags |= CHN_F_DEAD;
>>>>         printf("%s: play interrupt timeout, channel dead\n",
>>>>             c->name);
>>>> }
>>>>
>>>> return ret;
>>>> }
>>>>
>>>>
>>>> Could it be that the conditional test is wrong here? Perhaps
>>>> we should be using (count < 0) instead?
>>>
>>> I'm now running a kernel built with this last conditional test
>>> changed to "if (count < 0)" and sound is still working OK. Have yet to
>>> see if this eliminates the interrupt timeout messages.
>>
>> Well, that was a failure. :-) Didn't see any timeout error messages,
>> but the device still died eventually, nonetheless. I've since changed
>> back to the original code.
>
> That's an interesting data point. At this point I'd start looking at the
> driver code for your sound hardware. I suspect that the driver interrupt
> code is either no longer seeing interrupts, or it is no longer calling
> chn_intr().
>
>
>