Questionable code in sys/dev/sound/pcm/channel.c
Don Lewis
truckman at FreeBSD.org
Mon Jul 26 15:33:57 PDT 2004
On 26 Jul, Conrad J. Sabatier wrote:
>
> On 26-Jul-2004 Don Lewis wrote:
>> On 26 Jul, Conrad J. Sabatier wrote:
>>> I'm a little perplexed at the following bit of logic in chn_write()
>>> (which is where the "interrupt timeout, channel dead" messages are
>>> being generated).
>>>
>>> Within an else branch within the main while loop, we have:
>>>
>>> else {
>>>         timeout = (hz * sndbuf_getblksz(bs)) /
>>>             (sndbuf_getspd(bs) * sndbuf_getbps(bs));
>>>         if (timeout < 1)
>>>                 timeout = 1;
>>>         timeout = 1;
>>>
>>> Why the formulaic calculation of timeout, if it's simply going to be
>>> unconditionally set to 1 immediately afterwards anyway? What's
>>> going on here?
>>
>> Hmm, looks bogus to me. I think the intention is to round timeout up
>> to 1 if the result of the formula is zero, so the final unconditional
>> assignment looks wrong. Maybe a too-short timeout is the
>> source of this problem.
>>
>> It looks like this assignment appeared in rev 1.65.
>
> Hmm, your guess is as good as (or probably better than) mine. :-)
> A little more in the way of comments certainly wouldn't hurt.
>
>>> Also, at the end of the function:
>>>
>>>         if (count <= 0) {
>>>                 c->flags |= CHN_F_DEAD;
>>>                 printf("%s: play interrupt timeout, channel dead\n",
>>>                     c->name);
>>>         }
>>>
>>>         return ret;
>>> }
>>>
>>> Could it be that the conditional test is wrong here? Perhaps
>>> we should be using (count < 0) instead?
>>>
>>> I don't know. I'm having no small difficulty understanding this
>>> code, but these two items caught my attention.
>>
>> I ran into the same problem when I was looking at the code a few days
>> ago.
>>
>> BTW, the trace output that was posted showed write() returning 0
>> immediately before the failure occurred.
>
> Are you referring to the truss output I posted a few days ago? The
> thing of it is, though, that the original "channel dead" message had
> already occurred in a previous run of madplay (which wasn't traced), so
> it's really hard to say if there's any useful info to be obtained from
> tracing a later run, after the pcm device was already "broken".

I think that was it. The truss output looked like things were working
for a while before it croaked. I saw a bunch of writes succeed, then a
write returned 0, and then it looked like it died.

> So far, I still haven't gotten the error with the new kernel I'm
> testing. I wouldn't say absolutely that that single patch (of the
> final conditional test) is "the fix", but it may help in the meantime.

I just looked at the code some more. With timeout hardwired to 1, count
can never go negative. The code initializes count to hz, and then
decrements it whenever chn_sleep() returns EWOULDBLOCK, and
re-initializes count to hz if chn_sleep() returns zero. With timeout
hardwired to 1, count should only be able to decrement to zero if
chn_sleep() returns EWOULDBLOCK hz times in a row, which means that
nothing could be stuffed into the buffer for one second, which seems
like a long time ...
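
To make the arithmetic concrete, here is a rough user-space model of that
logic. It is only a sketch, not the driver code: fake_chn_sleep() is a
hypothetical stand-in for chn_sleep(), the units given for the formula
arguments are my assumption, and so is the idea that count is reduced by
timeout on each EWOULDBLOCK (with timeout == 1 that is the plain decrement
described above).

```c
#include <assert.h>
#include <errno.h>

/*
 * The timeout formula from chn_write(), rounded up to at least one tick.
 * Assumed units: blksz in bytes, spd in frames per second, bps in bytes
 * per frame, hz in ticks per second.
 */
static int
block_timeout_ticks(int hz, int blksz, int spd, int bps)
{
	int timeout;

	assert(hz > 0 && blksz > 0 && spd > 0 && bps > 0);
	timeout = (hz * blksz) / (spd * bps);
	if (timeout < 1)
		timeout = 1;
	return (timeout);
}

/*
 * Hypothetical stand-in for chn_sleep(): times out while *stalls_left is
 * positive, then reports a normal wakeup.
 */
static int
fake_chn_sleep(int *stalls_left)
{
	if (*stalls_left > 0) {
		(*stalls_left)--;
		return (EWOULDBLOCK);
	}
	return (0);
}

/*
 * Model of the wait: start count at hz, subtract timeout on every
 * EWOULDBLOCK, reset to hz on a successful wakeup. Returns the value
 * of count that the final "channel dead" test would see.
 */
static int
final_count(int hz, int timeout, int stalls)
{
	int count = hz;

	while (count > 0) {
		if (fake_chn_sleep(&stalls) == EWOULDBLOCK)
			count -= timeout;
		else {
			count = hz;	/* buffer drained, start over */
			break;
		}
	}
	return (count);
}
```

In this model, final_count(hz, 1, n) stops at exactly 0 once n reaches hz,
so a (count < 0) test could never fire, while a timeout larger than 1 can
step past zero. block_timeout_ticks() shows what the formula would give if
the second assignment were not there.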

I suspect that with your change the write() call is returning 0 and
the player software is doing a retry that succeeds (or this might be
audible as a skip).
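
For illustration, the kind of retry I have in mind on the player side could
look something like the following. This is a hypothetical helper, not
madplay's actual code; the name write_retry() and the max_zero_retries
limit are made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical player-side helper: write len bytes to fd, retrying up to
 * max_zero_retries times in a row when write() returns 0. Returns the
 * number of bytes written, or -1 on error (errno is left as set by
 * write()).
 */
static ssize_t
write_retry(int fd, const char *buf, size_t len, int max_zero_retries)
{
	size_t done = 0;
	int zero_retries = 0;

	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		if (n == 0) {
			/* Nothing accepted; give up after too many tries. */
			if (++zero_retries > max_zero_retries)
				break;
			continue;
		}
		zero_retries = 0;
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```

A player that loops like this would ride over a single zero-length write
without reporting an error, which would fit what the truss output showed
before the channel was marked dead.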