IF_HANDOFF vs. IFQ_HANDOFF

Tue Jun 20 09:54:26 UTC 2006

On Tue, Jun 20, 2006 at 05:11:18PM +1000, Bruce Evans wrote:
 > On Mon, 19 Jun 2006, Pyun YongHyeon wrote:
 > 
 > Please trim quotes.
 > 
 > >On Mon, Jun 19, 2006 at 06:04:26PM +1000, Bruce Evans wrote:
 > 
 > >> To max out the link without unmaxing CPU for other uses, you do have
 > >> to know when the tx approaches running out of packets.  This is best
 > >> done using watermark stuff.  There should be a nearly-complete interrupt
 > >> at low water, and (only after low water is reached and the interrupt
 > >> handler doesn't refill the tx ring to be above low water again) a
 > >> completion interrupt at actual completion.  My version of the sk driver
 > >> does this.  It arranges for the nearly-complete interrupt at about 32
 > >> fragments (min 128 uS) before the tx runs dry, and no other tx interrupts
 > >> unless the queue length stays below 32, while the -current driver gets
 > >> an interrupt after every packet.  It does this mainly to reduce the
 > >> tx interrupt load from 1 per packet to (under load) 1 per 480 fragments.
 > >> The correct handling of OACTIVE is obtained as a side effect almost
 > >> automatically.  ...
 > >>
 > >> I'm not very familiar with NIC hardware and don't know how other NICs
 > >> support timing of tx interrupts, but watermark stuff like the above
 > >> is routine for serial devices/drivers.  sk's support for interrupting
 > >> on any fragment is too flexible to be good (it is painful to program,
 > >> and there doesn't seem to be a good way to time out if there is no
 > >> good fragment to interrupt on or when you program the interruption on
 > >> a wrong fragment).
 > >> ...
 > 
 > >AFAIK SK GENESIS has no programming interface for a watermark.
 > >Some advanced hardware provides a way to interrupt when it reaches
 > >a programmed threshold but SK does not. It just lets a Tx descriptor
 > >value decide whether the hardware should raise an interrupt.
 > >By tracking the descriptor index it's possible to generate an interrupt
 > >for every N frames instead of every frame (1 <= N <= MAX Tx. Desc.).
 > 
 > I only have a Yukon, and think that's what I do, with a very variable N.
 > (Do we mean the same thing by the "Tx descriptor value"?  I mean

Yes.
 > SK_TXCTL_EOF_INTR.  Surely that's portable -- it's used in all versions
 > of sk with no ifdefs for GENESIS.).
 > 
 > My sk_start() tries to fill the tx ring (to length 512) and then put
 > an interrupt mark only on the last fragment in a packet nearest to 32
 > from the end, so in the best case N is about 480, but it is less if
 > tx is not streaming.  Cases where there is not much choice are harder
 > to program.  I had some success with removing interrupt marks and with
 > dummy packets of length 0 whose purpose is just to hold an interrupt
 > mark, but I don't trust those methods.  I didn't try putting an
 > interrupt mark on fragments in the middle of a packet.  That would be
 > simpler if it works.
 > 
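A minimal sketch of the mark placement described above, to make the
"N is about 480" arithmetic concrete. This is not code from any
version of the sk driver: pick_intr_frag(), TX_RING_CNT and
TX_LOW_WATER are hypothetical names, and marking the very last
fragment when the queue is already at or below low water is an
assumption based on the "completion interrupt at actual completion"
remark earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TX_RING_CNT	512	/* ring length mentioned in the thread */
#define TX_LOW_WATER	32	/* fragments left when the interrupt fires */

/*
 * Hypothetical helper.  frags[i] is the number of fragments
 * (descriptors) used by queued packet i.  Returns the index of the
 * fragment that should carry the interrupt mark, counted from the
 * first queued fragment, or -1 if nothing is queued.
 */
static int
pick_intr_frag(const int *frags, size_t npkts)
{
	int total = 0, end = -1, target, best = -1;
	size_t i;

	for (i = 0; i < npkts; i++)
		total += frags[i];
	assert(total <= TX_RING_CNT);

	if (total == 0)
		return (-1);
	/* Assumption: a short queue gets a plain completion interrupt. */
	if (total <= TX_LOW_WATER)
		return (total - 1);

	/* Ideal mark: exactly TX_LOW_WATER fragments remain after it. */
	target = total - TX_LOW_WATER - 1;

	/* Only the last fragment of a packet may carry the mark. */
	for (i = 0; i < npkts; i++) {
		end += frags[i];
		if (best < 0 || abs(end - target) < abs(best - target))
			best = end;
	}
	return (best);
}
```

With a full ring of single-fragment packets the mark lands on
fragment index 479, i.e. one interrupt per 480 fragments.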

I think it would take a long time to generate a Tx completion
interrupt for committed frames (every frame vs. the last frame). The
hardware may have some free Tx descriptors before generating a
Tx completion interrupt. I guess it would be more efficient if we
knew there were some free Tx descriptors and used them before waiting
for a Tx completion interrupt. Just waiting for a completion interrupt
would add latency. Anyway, I have to experiment with it.

 > >We may also need to add a routine to reclaim pending Tx descriptors
 > >before sending frames in sk_start if the number of available Tx
 > >descriptors is less than a threshold.
 > 
 > I'm not sure what you mean here.  If there are < 32 tx descriptors
 > available, AND there is an (active) descriptor with an interrupt mark,
 > then my sk_start() just sets IFF_OACTIVE and returns.  The case where
 > there are < 32 tx descriptors but no descriptor with an interrupt mark
 > is trickier: a mark must be added, and I don't trust adding it to an
 > active packet, so it must be added to a new packet, but it might be
 > impossible to add one for the following reasons:
 > - no space.  The magic 32 is hopefully enough.
 > - no packets in the ifq.  My sk_start() tries to leave a spare one when
 >   one might be needed, but I think upper layers can eat it.
 > A dummy packet of length 0 can be used to handle both cases but may be
 > bad for the network -- does the hardware send a frame with no data?

I'm not sure.
Since you know when you have to insert an interrupt mark in sk_encap,
I think you can use m_defrag and set SK_TXCTL_EOF_INTR.
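
For reference, the sk_start() cases quoted above can be written down as
a small decision function. This is only a sketch of my reading of the
description, not the actual code: tx_start_decide() and the enum names
are hypothetical, and the last case is left as an unresolved fallback
because the thread does not settle whether a zero-length dummy packet
is safe.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_LOW_WATER	32	/* the "magic 32" from the thread */

enum tx_action {
	TX_ENQUEUE,		/* enough room: keep filling the ring */
	TX_WAIT_OACTIVE,	/* low on room, mark pending: set OACTIVE */
	TX_ADD_MARK,		/* low on room, no mark: mark the next packet */
	TX_NEED_FALLBACK	/* no space or no packet to carry a mark */
};

/*
 * free_desc:       free tx descriptors right now
 * mark_pending:    an active descriptor already carries an interrupt mark
 * ifq_len:         packets waiting in the interface queue
 * next_pkt_frags:  fragments needed by the packet at the head of the queue
 */
static enum tx_action
tx_start_decide(int free_desc, bool mark_pending, int ifq_len,
    int next_pkt_frags)
{
	if (free_desc >= TX_LOW_WATER)
		return (TX_ENQUEUE);
	if (mark_pending)
		return (TX_WAIT_OACTIVE);
	/* The mark must go on a new packet, which needs a packet and room. */
	if (ifq_len > 0 && next_pkt_frags <= free_desc)
		return (TX_ADD_MARK);
	return (TX_NEED_FALLBACK);
}
```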

 > 
 > >However I don't know how the driver should handle transmit errors
 > >that occur between interrupt-less Tx operations. Just flushing all
 > >committed frames would result in poor TCP performance.
 > 
 > Doesn't the hardware just proceed to the next packet without interrupting
 > (except possibly for a special error interrupt), and anyway act the same
 > as if the interrupt were delayed by interrupt moderation?  Errors for
 > individual packets don't seem to be detected or reported in either case.
 > 

Yes, that is the problem. It seems that there is no way to know which
packet caused Tx errors, and I think we have no choice but to flush
the entire FIFOs. SK just flushes all frames in the FIFO if it detects
a Tx FIFO underrun or Rx FIFO overflow. But I'm not sure how Yukon
should handle this case. The flushing routine in sk is guesswork based
on the Linux skge implementation, and I don't know the internal details
of the Yukon hardware. Since Yukon uses different registers to flush
FIFOs and has unique registers related to interrupts and FIFOs, I guess
it uses a completely different approach.

 > >The difference between Yukon and SK hardware also makes it hard to
 > >implement the above interrupt-less Tx operations. There is no publicly
 > 
 > My version is not interruptless, but tries to use tx interrupts for
 > everything, just not many of them.
 > 

Ok, I'll take your idea and will try to experiment with it next week.

-- 
Regards,
Pyun YongHyeon