Interesting anomaly with a 2940UW

Doug Ledford dledford at dialnet.net
Tue Sep 9 16:41:10 PDT 1997


--------
> >Wouldn't matter.  If we did pause things here, then when we unpaused them, 
> >the QOUTCNT register would get incremented as we are writing CLRCMDINT to 
> >CLRINT, then we would check QOUTCNT again, it would be non-zero, so we would 
> >re-run the loop, and we would re-write the CMDOUTCNT variable again.
> 
> Sure, but what causes high interrupt latency, Doug?  Other interrupt
> handlers running, or your interrupt handler taking a long time, is what
> causes it.  In FreeBSD, your interrupt handler can be interrupted by any
> non masked interrupts which means, during that window, you could easily be
> diverted from running your loop perhaps long enough for multiple commands
> to pile up which might just be long enough for you to overflow the qoutfifo
> before your interrupt handler is resumed and you can complete your work.
> And, since the sequencer might have stuffed multiple commands into the
> QOUTFIFO since you last read it, the variance in what you write to
> CMDOUTCNT and how full the fifo is could be quite large.  For example:
> 
> qoutcnt <- QOUTCNT == 5
> Process 5 commands in interrupt handler    10 Commands complete in sequencer
> Set CMDOUTCNT to 0 (should be 10)
> 
> qoutcnt <- QOUTCNT == 10
> Process 10 commands                        8 Commands complete in sequencer
>         OVERFLOW!!!!

As I think you noticed later on in the email, this is moot since we have 
interrupts disabled as we are grabbing the QOUTFIFO entries and putting them 
on our internal completion queue and we aren't doing completion processing 
here, so we should be well outrunning the sequencer.  Also, since we loop 
until QOUTCNT goes to 0, we know that we've grabbed everything with only a 
*very* small window for one command to complete after we check (and 
according to the comments in the aic7xxx.c file, Dan structured the isr in 
an attempt to defeat this window, so maybe that doesn't even exist).
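
To make the "grab the entries, process them later" step concrete, here is a 
minimal sketch against a simulated card, not the driver's actual code.  The 
names sim_card, done_queue, sim_post, sim_pop and drain_qoutfifo are all 
hypothetical, and the FIFO depth of 16 is an assumption for illustration 
only.  The point is just that the drain loop runs until the count reads 0 
and does nothing but move tags.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_DEPTH 16   /* hypothetical QOUTFIFO depth, illustration only */

/* Simulated card: stands in for the real QOUTFIFO/QOUTCNT registers. */
struct sim_card {
    unsigned char fifo[FIFO_DEPTH];
    unsigned head;      /* next entry the host will read */
    unsigned count;     /* what a read of QOUTCNT would return */
};

/* Host-side completion queue: filled in the isr, processed later. */
struct done_queue {
    unsigned char tag[64];
    size_t len;
};

/* Sequencer side: post one completed tag.  Returns 0 on overflow. */
static int sim_post(struct sim_card *c, unsigned char tag)
{
    if (c->count == FIFO_DEPTH)
        return 0;
    c->fifo[(c->head + c->count) % FIFO_DEPTH] = tag;
    c->count++;
    return 1;
}

/* Host side: pop one tag, as a read of QOUTFIFO would. */
static unsigned char sim_pop(struct sim_card *c)
{
    unsigned char tag;

    assert(c->count > 0);
    tag = c->fifo[c->head];
    c->head = (c->head + 1) % FIFO_DEPTH;
    c->count--;
    return tag;
}

/* Drain until the count reads 0; no completion processing happens here. */
static size_t drain_qoutfifo(struct sim_card *c, struct done_queue *q)
{
    size_t moved = 0;

    while (c->count != 0 && q->len < sizeof(q->tag)) {
        q->tag[q->len++] = sim_pop(c);
        moved++;
    }
    return moved;
}
```

Because the loop body is only a register read and a store, the host should 
stay well ahead of anything the sequencer can post in the meantime.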


> Okay.  I'll tell you again.  It's inefficient code.  In the example
> you cite, you're only able to fill the QOUTFIFO 56 times after performing
> how many transactions???  Probably a few hundred thousand on a busy news
> server if not more.
> 
> I never said that you don't have high interrupt latency.  What I said was
> that I don't have high interrupt latency, but of course, I don't run
> Linux.  In my system, the hardware interrupt handler for the aic7xxx card
> simply removes the entry from the QOUTFIFO, sets a few status bits in
> the generic SCSI structure associated with this transaction and queues it
> to a software interrupt handler.

Which is exactly what we do in that particular section of code; we do the 
call to scsi_done later on, which is where our latency comes from.  So, yes, 
the code is inefficient, but in a best case scenario, we should pause, 
write, unpause only once per interrupt.  In a worst case scenario we would 
pause, write, unpause twice in an interrupt (because a command completed 
while we were reading the qoutfifo).  Without the benefit of a PCI bus 
analyzer, it would seem to me that this is better than the lazy updates to 
the CMDOUTCNT register in the fashion that you use them.  However, I would 
agree with lazy updates if they were done something like this (something I 
thought about in between the time I wrote you and you responded):

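  /* count completions on the host side since the last register write */
  p->cmdoutcnt += qoutcnt;
  .... do stuff ....
  /* only touch the card once we are more than half way to full */
  if ((p->flags & PAGE_ENABLED) && (p->cmdoutcnt > (p->qfullcnt >> 1)))
  {
    outb(0, p->base + CMDOUTCNT);
    p->cmdoutcnt = 0;  /* restart the host-side count as well */
  }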

At least this way, with our high latency, we wouldn't risk spin locking on 
only a few commands (unless the fifo depth was very small).  Instead, we 
would update the variable each time we got more than halfway full, and that 
would leave half of the real depth as an effective, always-correct space 
count.
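
Here is a small simulation of that half-depth rule, as a sketch only.  The 
struct sim_host and the function lazy_update are hypothetical names, the 
CMDOUTCNT register is just a field in the struct, and the PAGE_ENABLED test 
is left out.  Restarting the host-side count after the write is my reading 
of "each time" above, so treat that line as an assumption.

```c
#include <assert.h>

/* Simulated state; field names are made up for this example. */
struct sim_host {
    unsigned cmdoutcnt;      /* host-side count since the last register write */
    unsigned qfullcnt;       /* depth the sequencer may fill before spinning */
    unsigned reg_cmdoutcnt;  /* stand-in for the card's CMDOUTCNT register */
    unsigned reg_writes;     /* how many times we had to touch the card */
};

/*
 * Account for qoutcnt completions just drained.  Write 0 to the
 * (simulated) CMDOUTCNT only when more than half the depth has been
 * used since the last write.  Returns 1 if the register was written.
 */
static int lazy_update(struct sim_host *h, unsigned qoutcnt)
{
    assert(h->qfullcnt > 0);

    h->cmdoutcnt += qoutcnt;
    if (h->cmdoutcnt > (h->qfullcnt >> 1)) {
        h->reg_cmdoutcnt = 0;  /* outb(0, p->base + CMDOUTCNT) on real hardware */
        h->cmdoutcnt = 0;      /* assumption: restart the host-side count */
        h->reg_writes++;
        return 1;
    }
    return 0;
}
```

With a depth of 16, batches of 5 and then 3 completions leave the card 
alone (8 is not greater than 8), and the next completion triggers the one 
write.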


> It's a shame that Linux doesn't offer a decent software interrupt strategy,
> but that's not my problem.  You should still be able to get decent latency
> for setting the CMDOUTCNT back to 0 if you clear the QOUTFIFO first,
> putting entries into a list, setting CMDOUTCNT to zero, then processing
> the entries on the list.  You are probably getting into your interrupt
> handler plenty fast, but getting crushed by the overhead of generic SCSI
> processing at interrupt time.

Getting in, getting out.  It doesn't matter.  If our interrupt handler gets 
in plenty fast to grab things, but then gets delayed in our tail end 
execution, then we still block the sequencer until we can get out and get 
re-entered.


> Wow.  I never knew that you used to run your interrupt handler with all 
> other interrupts disabled.  Don't your network servers drop packets like
> crazy when you do this?

Ummm...there are assorted problems under very high load, but fortunately in 
my case anyway, the network card I use has a rather large rx ring buffer 
that is accessed via DMA, so it tends to survive (or if it does drop 
packets, it doesn't say anything).  However, the change I mentioned with 
regard to enabling interrupts specifically during the completion processing 
has a good deal of impact on that situation.

> 
> >The second reason I wrote it that way is because of this.  Let's say your 
> >code answers an interrupt with two commands on the QOUTFIFO, and p->
> >cmdoutcnt == 12, then cmdoutcnt will get incremented to 14 while the 
> >QOUTFIFO goes to zero.  Now, if the next interrupt has a high latency, then 
> >you may end up using that spin lock far before you ever reach the QOUTFIFO 
> >depth since you didn't update the CMDOUTCNT variable during the last isr.  
> >So, which is more inefficient, allowing a high latency interrupt to block 
> >with only a command or two complete, or writing out the actual CMDOUTCNT on 
> >each interrupt routine when we are already writing to the card?  Keep in 
> >mind the interrupt latency that we see sometimes.
> 
> I'm fully aware that CMDOUTCNT does not directly track the current state
> of the FIFO.  I wanted a lazy update as it means I only have to do a single
> write which can be done with AAP.  In order for your algorithm to work, you
> have to perform a read and a write with the sequencer paused and having 
> looked at what this does with a PCI bus analyzer, it's simply not worth
> it.

Says who?  When we go through that code the sequencer is in one of three 
states.  One, running.  Two, spin locked for the CMDOUTCNT variable.  Three, 
paused for some other INT condition (seqint, scsiint).  If we are running 
and we write a 0 to CMDOUTCNT, then we've got from the time we write until 
after we've written to CLRINT for another command to complete.  If we hit 
the race window you are talking about, then we should re-run our loop as we 
read the QOUTCNT register, see we have another command, re-run the loop, 
re-write the CMDOUTCNT variable, race fixed because we simply wrote a 0 over 
a 0 while also emptying the QOUTFIFO.  If we are spin locking, then when we 
write the variable and unpause, we end up nearly immediately writing to 
QOUTFIFO in the sequencer, we catch that in QOUTCNT (since there is a delay 
as we write to CLRINT) and we re-run the loop.  If we are paused for a 
seqint or scsiint, then we don't unpause, we aren't near a command 
completion, race window doesn't exist.  Now, without a PCI analyzer to guide 
me on this, I could be wrong, but it seems to me that as small as the race 
window is that you pointed out in the sequencer, if we hit that race window, 
the extra check of the actual QOUTCNT register a few lines later after 
having written to CLRINT should catch that race.  The only way for it to 
miss is if we are able to complete the unpause_sequencer(); outb(CLRCMDINT, 
p->base + CLRINT); interrupts_cleard++; inb(p->base + QOUTCNT); faster than 
the sequencer can do a mov QOUTFIFO, SCB_TAG;  This is if we happened to 
pause the sequencer right after the inc CMDOUTCNT; statement.  The other 
possible race is if the sequencer is spin locked, but then it does the inc 
after we have written to CMDOUTCNT, so that isn't really a race at all.  
That's why I don't bother to re-read the QOUTCNT register, because if it 
isn't 0, then we are going to re-run the loop anyway.
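
A toy model of that argument, with no real hardware behind it: the struct 
sim_regs, the "late" field and both functions are hypothetical, and the late 
completion is simply forced to land while we write to CLRINT.  It only shows 
that the re-check of QOUTCNT at the bottom of the loop picks the straggler 
up and that the second write of 0 to CMDOUTCNT puts things right.

```c
#include <assert.h>

/* Simulated registers; names are made up for this example. */
struct sim_regs {
    unsigned qoutcnt;    /* stand-in for QOUTCNT */
    unsigned cmdoutcnt;  /* stand-in for the sequencer's CMDOUTCNT */
    unsigned late;       /* completions that land inside the race window */
};

/* Simulated write to CLRINT: one late completion sneaks in here. */
static void sim_clear_cmdint(struct sim_regs *r)
{
    if (r->late > 0) {
        r->late--;
        r->qoutcnt++;
        r->cmdoutcnt++;
    }
}

/* Returns entries drained in total; *passes is how often the loop ran. */
static unsigned handle_cmdcmplt(struct sim_regs *r, unsigned *passes)
{
    unsigned drained = 0;

    *passes = 0;
    do {
        drained += r->qoutcnt;   /* stand-in for reading the QOUTFIFO entries */
        r->qoutcnt = 0;
        r->cmdoutcnt = 0;        /* pause, write 0, unpause on real hardware */
        sim_clear_cmdint(r);     /* outb(CLRCMDINT, p->base + CLRINT) */
        (*passes)++;
    } while (r->qoutcnt != 0);   /* the extra check that catches the race */

    return drained;
}
```

Starting with three entries and one late arrival, the loop runs twice, 
drains four entries, and leaves both counts at 0.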

> >Also, who's to say the 
> >reason you don't see messages about the QOUTCNT isn't due to this very 
> >condition instead of interrupt latency?  A better test to see if this 
> >algorithm does what you want would be not to check and print a message about 
> >the QOUTFIFO depth, but check to see if your sequencer is spin locking on 
> >CMDOUTCNT and holding up the bus.
> 
> Actually, I incremented a count in sequencer scratch ram for every time I
> hit the lock.  Either every time I went to look it had wrapped to 0 or my
> lock was never hit.  As I said before, you are probably getting into
> your interrupt handler plenty fast, it's just that your interrupt handler
> runs for a long time before you go back and clean out the queue.

That's good for BSD, but I suspect that if you checked that lock under 
Linux, it would be incrementing.  Our basic flow of the isr is like this:

handle cmdcmplt interrupts
handle seqint
handle scsiint
 (sequencer should be unpaused at this point)
enable interrupts again
run completion processing
exit isr

While we are doing the completion processing, the kernel won't allow our isr 
to be re-entrant, so that's the cause of our latency, but regardless, it's 
still an occasionally long time before we get around to re-entering ourselves 
and cleaning the queue out again.
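
The same flow as a skeleton, purely as a sketch: struct sim_isr and 
sim_isr_run are hypothetical, the interrupt flag and sequencer pause state 
are plain fields, and the "handling" is reduced to bookkeeping so the order 
of the steps is the only thing on show.

```c
#include <assert.h>

/* Simulated isr state; every name here is made up for illustration. */
struct sim_isr {
    int irqs_enabled;       /* simulated CPU interrupt flag */
    int seq_paused;         /* simulated sequencer pause state */
    int pending_done;       /* entries on the host completion queue */
    int done_with_irqs_on;  /* completions processed with interrupts enabled */
};

static void sim_isr_run(struct sim_isr *s, int cmdcmplt, int seqint, int scsiint)
{
    s->irqs_enabled = 0;            /* we enter with interrupts disabled */

    /* handle cmdcmplt interrupts: only queue the entries for later */
    s->pending_done += cmdcmplt;

    /* handle seqint / scsiint: the sequencer is paused while we do */
    if (seqint || scsiint) {
        s->seq_paused = 1;
        /* ... service the condition ... */
        s->seq_paused = 0;
    }

    assert(!s->seq_paused);         /* sequencer should be unpaused here */
    s->irqs_enabled = 1;            /* enable interrupts again */

    /* run completion processing; this is the long part */
    while (s->pending_done > 0) {
        s->pending_done--;
        if (s->irqs_enabled)
            s->done_with_irqs_on++;
    }
}
```

Other devices can interrupt during the last loop, but our own isr is not 
re-entered until it returns, which is where the delay in getting back to 
the QOUTFIFO comes from.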

-- 
*****************************************************************************
* Doug Ledford                      *   Unix, Novell, Dos, Windows 3.x,     *
* dledford at dialnet.net    873-DIAL  *     WfW, Windows 95 & NT Technician   *
*   PPP access $14.95/month         *****************************************
*   Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
*   communities.  Sign-up online at * Web page creation and hosting, other  *
*   873-9000 V.34                   * services available, call for info.    *
*****************************************************************************




