Interesting anomoly with a 2940UW

Tue Sep 9 17:20:17 PDT 1997

>> qoutcnt <- QOUTCNT == 5
>> Process 5 commands in interrupt handler    10 Commands complete in sequencer
>> Set CMDOUTCNT to 0 should be 10
>> 
>> qoutcnt <- QOUTCNT == 10
>> Process 10 commands			    8 Commands complete in sequencer
>> 	OVERFLOW!!!!
>
>As I think you noticed later on in the email, this is moot since we have 
>interrupts disabled as we are grabbing the QOUTFIFO entries and putting them 
>on our internal completion queue and we aren't doing completion processing 
>here, so we should be well outrunning the sequencer.

But you mentioned that on your machines you are not running will all
interrupts disabled which means that you are still open to this window.

>Also, since we loop 
>until QOUTCNT goes to 0, we know that we've grabbed everything with only a 
>*very* small window for one command to complete after we check (and 
>according to the comments in the aic7xxx.c file, Dan structured the isr in 
>an attempt to defeat this window, so maybe that doesn't even exist).

This isn't true if you don't run with all interrupt sources disabled.
I never even contemplated that you would do that.

>> I never said that you don't have high interrupt latency.  What I said was
>> that I don't have high interrupt latency, but of course, I don't run
>> Linux.  In my system, the hardware interrupt handler for the aic7xxx card
>> simply removes the entry from the QOUTFIFO, sets a few status bits in
>> the generic SCSI structure associated with this transaction and queues it
>> to a software interrupt handler.
>
>Which is exactly what we do in that particular section of code, we do the 
>call to scsi_done later on, which is where our latency comes from.

But in the FreeBSD case, the completion processing is done from a software
interrupt handler that can be preempted by our hardware interrupt handler.

>So, yes, 
>the code is inneficient, but in a best case scenario, we should pause, 
>write, unpause only once per interrupt.  In a worst case scenario we would 
>pause, write, unpause twice in an interrupt (becuase a command completed 
>while we were reading the qoutfifo).  Without the benefit of a PCI bus 
>analyzer, it would seem to me that this is better than the lazy updates to 
>the CMDOUTCNT register in the fasion that you use them.  However, I would 
>agree with lazy updates if they were done something like this (something I 
>thought about in between the time I wrote you and you responded):
>
>  p->cmdoutcnt += qoutcnt;
>  .... do stuff ....
>  if ((p->flags & PAGE_ENABLED) && (p->cmdoutcnt > (p->qfullcnt >> 1)))
>  {
>    outb(0, p->base + CMDOUTCNT);
>  }
>
>At least this way, with our high latency, we wouldn't risk spin locking on 
>only a few commands (unless the fifo depth was very small).  Instead, we 
>would update the variable once we got half way full each time, and that 
>would leave half of the real depth as an effective always correct space 
>count.

But you've already shown that it can take sufficiently long for you to
get back to servicing the queue that the queue is full, so having only
half the count as a safe guard is not sufficient.  The count must be
correct if you want to remove the window of vulnerability.

>> I'm fully aware that CMDOUTCNT does not directly track the current state
>> of the FIFO.  I wanted a lazy update as it means I only have to do a single
>> write which can be done with AAP.  In order for your algorithm to work, you
>> have to perform a read and a write with the sequencer paused and having 
>> looked at what this does with a PCI bus analyzer, it's simply not worth
>> it.
>
>Says who?  When we go through that code the sequencer is in one of three 
>states.  One, running.  Two, spin locked for the CMDOUTCNT variable.  Three, 
>paused for some other INT condition (seqint, scsiint).  If we are running 
>and we write a 0 to CMDOUTCNT, then we've got from the time we write until 
>after we've written to CLRINT for another command to complete.  If we hit 
>the race window you are talking about, then we should re-run our loop as we 
>read the QOUTCNT register, see we have another command, re-run the loop, 
>re-write the CMDOUTCNT variable, race fixed because we simply wrote a 0 over 
>a 0 while also emptying the QOUTFIFO.

Assuming that the sequencer unpauses fast enough to put the entry in the
queue for you to see it.  The time it takes for the sequencer to restart
execution isn't specified and I don't think you can make the assumption
that this will work.  To put it another way, if you are so sure that the
race cannot happen with the way you have it coded now, how did you overflow
the QOUTFIFO???

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================