Interesting anomaly with a 2940UW

Doug Ledford dledford at dialnet.net
Tue Sep 9 18:03:33 PDT 1997


--------

> >As I think you noticed later on in the email, this is moot since we have 
> >interrupts disabled as we are grabbing the QOUTFIFO entries and putting them 
> >on our internal completion queue and we aren't doing completion processing 
> >here, so we should be well outrunning the sequencer.
> 
> But you mentioned that on your machines you are not running with all
> interrupts disabled, which means that you are still open to this window.

We do have all interrupts disabled in that section.  It also does very 
little, much like your routine for that section: it grabs the tag, verifies 
it's a valid command, then sticks it on a completion queue for later 
processing.  All of this is done entirely with interrupts off.
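In rough outline, that section looks something like this (a sketch only, not 
the literal driver code; the scb_array lookup, SCB_ACTIVE flag, and 
completeq list are illustrative names):

save_flags(processor_flags);
cli();                                  // all interrupts off for the drain
while (inb(p->base + QOUTCNT) & p->qcntmask)
{
  tag = inb(p->base + QOUTFIFO);        // grab the tag
  scb = p->scb_array[tag];
  if ((scb == NULL) || !(scb->flags & SCB_ACTIVE))
    continue;                           // not a valid command, ignore it
  scb->next = p->completeq;             // stick it on the completion queue
  p->completeq = scb;
}
restore_flags(processor_flags);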

> >Also, since we loop 
> >until QOUTCNT goes to 0, we know that we've grabbed everything with only a 
> >*very* small window for one command to complete after we check (and 
> >according to the comments in the aic7xxx.c file, Dan structured the isr in 
> >an attempt to defeat this window, so maybe that doesn't even exist).
> 
> This isn't true if you don't run with all interrupt sources disabled.
> I never even contemplated that you would do that.

We do.

> But in the FreeBSD case, the completion processing is done from a software
> interrupt handler that can be preempted by our hardware interrupt handler.

Enabling interrupts during our completion processing is similar in most 
respects except for one: our completion code can get preempted by hardware 
interrupts from any source except ours.  Or, more precisely, as long as the 
aic7xxx_isr() function has not returned, any hardware interrupts that would 
be sent to that function are blocked, regardless of whether interrupts are 
on or off.  Other interrupts are allowed through, though, so a network card 
on another int will have no problem picking up its packets while we are 
doing completion; we just won't be able to clear our QOUTFIFO at all during 
our completion.
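Schematically, the whole isr then has this shape (just the outline of the 
full fragment shown further below):

aic7xxx_isr(...)                   // our own IRQ line is blocked from here...
{
  cli();                           // drain the QOUTFIFO, nothing preempts us
  ...
  restore_flags(processor_flags);  // other devices' interrupts get through
  aic7xxx_done_cmds_complete(p);   // completion can be preempted by them, but
                                   // our own line stays blocked
}                                  // ...until we return here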

> 
> >So, yes, 
> >the code is inefficient, but in a best case scenario, we should pause, 
> >write, unpause only once per interrupt.  In a worst case scenario we would 
> >pause, write, unpause twice in an interrupt (because a command completed 
> >while we were reading the qoutfifo).  Without the benefit of a PCI bus 
> >analyzer, it would seem to me that this is better than the lazy updates to 
> >the CMDOUTCNT register in the fashion that you use them.  However, I would 
> >agree with lazy updates if they were done something like this (something I 
> >thought about in between the time I wrote you and you responded):
> >
> >  p->cmdoutcnt += qoutcnt;
> >  .... do stuff ....
> >  if ((p->flags & PAGE_ENABLED) && (p->cmdoutcnt > (p->qfullcnt >> 1)))
> >  {
> >    outb(0, p->base + CMDOUTCNT);
> >    p->cmdoutcnt = 0;  /* we've drained the fifo, so restart the count */
> >  }
> >
> >At least this way, with our high latency, we wouldn't risk spin locking on 
> >only a few commands (unless the fifo depth was very small).  Instead, we 
> >would update the variable once we got half way full each time, and that 
> >would leave half of the real depth as an effective always correct space 
> >count.
> 
> But you've already shown that it can take sufficiently long for you to
> get back to servicing the queue that the queue is full, so having only
> half the count as a safe guard is not sufficient.  The count must be
> correct if you want to remove the window of vulnerability.

Hmmm....I believe that's what my strict CMDOUTCNT setting was attempting to 
do, so are you now saying that we need strict CMDOUTCNT checking under 
Linux?  To be quite honest, I think most of the races you have mentioned in 
relation to my usage of the CMDOUTCNT variable exist not only in my code, 
but in yours as well.  The main difference is that Linux *needs* this lock 
to work properly and efficiently, whereas your testing under FreeBSD 
indicates you don't really need the lock, so it couldn't matter less in 
FreeBSD if the count is occasionally off by a number here or there.  In 
FreeBSD you aren't hitting the lock, so not keeping it fairly close to the 
actual QOUTCNT isn't a big deal; you never spin lock because of it.  We do 
hit the lock, so the less accurate it is, the more efficiency we lose on 
the bus as the sequencer spin locks waiting on us.
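To be clear about what "the lock" is: conceptually, the sequencer does 
something like this before posting each completed command (written as C 
pseudocode for readability; the real thing is sequencer firmware, and the 
exact comparison against the fifo depth is my assumption):

  while (CMDOUTCNT >= QFULLCNT)
    ;                            // spin: host hasn't made room in the fifo
  CMDOUTCNT++;                   // account for the entry we're about to post
  QOUTFIFO = scb_tag;            // post the completion

Every cycle it spends in that spin loop is bus time we never get back.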

> 
> >> I'm fully aware that CMDOUTCNT does not directly track the current state
> >> of the FIFO.  I wanted a lazy update as it means I only have to do a single
> >> write which can be done with AAP.  In order for your algorithm to work, you
> >> have to perform a read and a write with the sequencer paused and having 
> >> looked at what this does with a PCI bus analyzer, it's simply not worth
> >> it.
> >
> >Says who?  When we go through that code the sequencer is in one of three 
> >states: one, running; two, spin locked on the CMDOUTCNT variable; three, 
> >paused for some other INT condition (seqint, scsiint).  If we are running 
> >and we write a 0 to CMDOUTCNT, then another command has from the time we 
> >write until after we've written to CLRINT in which to complete.  If we hit 
> >the race window you are talking about, then we re-run our loop: we read 
> >the QOUTCNT register, see we have another command, re-run the loop, and 
> >re-write the CMDOUTCNT variable.  Race fixed, because we simply wrote a 0 
> >over a 0 while also emptying the QOUTFIFO.
> 
> Assuming that the sequencer unpauses fast enough to put the entry in the
> queue for you to see it.  The time it takes for the sequencer to restart
> execution isn't specified and I don't think you can make the assumption
> that this will work.  To put it another way, if you are so sure that the
> race cannot happen with the way you have it coded now, how did you overflow
> the QOUTFIFO???

This may be true, and it may be the very reason that we did overflow the 
QOUTFIFO.  However, as we've discussed throughout all of this, I think it is 
correct to say that the race window is limited to a single command.  So, if 
I wanted to play it safe, I could always just output 1 instead of 0 to 
CMDOUTCNT, and that would solve the race at the expense of a single entry's 
worth of space in the QOUTFIFO, as opposed to possibly a large number of 
commands using your update method.  Furthermore, if I were going to do that, 
then I could delay any outb until we have actually left the loop, which 
guarantees only one write per invocation of the isr().  In a worst case 
scenario, I would output 1 while we were still looping, so when the 
sequencer was unpaused it would immediately increment that to 2 and place a 
command on the QOUTFIFO, which we would catch on our next interrupt.

I.e., something like:

save_flags(processor_flags);
cli();
instat = inb(p->base + INTSTAT);   // interrupt status; makes the SEQINT and
                                   // SCSIINT checks below self-contained
qoutcnt = inb(p->base + QOUTCNT) & p->qcntmask;
while (qoutcnt)
{
  for (i = 0; i < qoutcnt; i++)
  {
    // grab the tag, validate it, put it on our completion queue
  }
  outb(CLRCMDINT, p->base + CLRINT);
  interrupts_cleared++;
  qoutcnt = inb(p->base + QOUTCNT) & p->qcntmask;
}
if (p->flags & PAGE_ENABLED)
{
  outb(1, p->base + CMDOUTCNT);    // 1, not 0: leave one entry of slack to
                                   // cover the single-command race window
}

if (instat & SEQINT)
  aic7xxx_handle_seqint(p);
if (instat & SCSIINT)
  aic7xxx_handle_scsiint(p);
unpause_sequencer(p, FALSE); // We don't need to force the unpause since it
                             // only fails on seqint and scsiint, which we
                             // either just cleared, or have set again and
                             // need to be left paused for.

restore_flags(processor_flags);  // In our case, turns interrupts back on
aic7xxx_done_cmds_complete(p);   // Does our completion processing

// clean up stuff here

return;


Does that sound any more efficient (and race-protected) to you?

-- 
*****************************************************************************
* Doug Ledford                      *   Unix, Novell, Dos, Windows 3.x,     *
* dledford at dialnet.net    873-DIAL  *     WfW, Windows 95 & NT Technician   *
*   PPP access $14.95/month         *****************************************
*   Springfield, MO and surrounding * Usenet news, e-mail and shell account.*
*   communities.  Sign-up online at * Web page creation and hosting, other  *
*   873-9000 V.34                   * services available, call for info.    *
*****************************************************************************




