igb interrupt moderation

Barney Cordoba barney_cordoba at yahoo.com
Sun Jan 3 16:44:23 UTC 2010


--- On Sun, 1/3/10, Michael Tüxen <Michael.Tuexen at lurchi.franken.de> wrote:

> From: Michael Tüxen <Michael.Tuexen at lurchi.franken.de>
> Subject: Re: igb interrupt moderation
> To: "Barney Cordoba" <barney_cordoba at yahoo.com>
> Cc: freebsd-net at freebsd.org, jfvogel at gmail.com
> Date: Sunday, January 3, 2010, 8:55 AM
> Hi Barney, Hi Jack,
> 
> some comments and some more questions inside...
> 
> Best regards
> Michael
> 
> On Jan 2, 2010, at 8:42 PM, Barney Cordoba wrote:
> 
> > Jack,
> > 
> > I'm trying to get some clarification on differences I'm finding between
> > the 82575 and 82576 parts with respect to interrupt moderation. The spec
> > I have for the 82576 (82576_Datasheet_v2p1.pdf) indicates that the
> I'm only commenting on the 82576. You can get rev 2.41 from Intel's
> website...
> > 
> > ITR algorithm is different than the one used (I don't have one of the
> > secret copies of the 82575 spec). The algorithm shown is
> > 
> > interrupts/sec = 1/(2 * 10^-6 sec x interval) (page 295, Section 7.3.4)
> > 
> > which is clearly wrong from practice. I have an 82576 (device id 10C9);
> If you look at section 8.8.12, you find other formulas...
> Jack: Which ones are correct?
> > if I use the 125d setting in the example I get just under 32000 interrupts
> > per second. Clearly your code doesn't implement this, nor do you have
> > different settings for the 82575 and 82576 parts. So I assume that the
> > same formula for the em parts holds for the igb parts, and that the
> > datasheet is wrong?
> > 
> > There does seem to be a slight difference. The setting that gets 1000
> > ints/second on the 82575 generates about 1020 on the 82576. Not a big
> > deal but I wonder why there's a difference? Is the reference clock for
> > these something that may not be fixed and could vary from board to
> > board? Note that both devices are on the same MB.
> > 
> > Also, it seems that settings to EITR over 32767 wrap on the 82576 (for
> > example writing 32768 to EITR is the same as writing a 1). So the
> > minimum setting on the 82576 is around 125 ints/second. The 82575 can
> > accept values up to 65535 before wrapping.
> Hmm, looking at the table in 8.8.12 would suggest:
> Setting it to one sets a reserved bit, but does not change the interval.
> Setting it to 2^15 should set the LLI_EN bit, but does not change the
> interval.
> 
> Jack is setting the register to
> igb_low_latency: 128
> igb_ave_latency: 450
> igb_bulk_latency: 1200
> 
> This would result in intervals of:
> igb_low_latency: 32
> igb_ave_latency: 112
> igb_bulk_latency: 300
> Jack: What are the corresponding interrupt rates? The spec provides different
>       formulas and talks about a 1us, 2us or 8us counter. Not sure what is right...
> Jack: Why are you setting bit 1 (which is reserved) in the case of
>       igb_ave_latency?
> 
> And another question for Jack:
> In igb_update_aim() you do
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             newitr | (newitr << 16));
>     }
> So why are you setting the higher bits of the EITR? You are setting
> igb_low_latency: the LL Counter becomes 0, the moderation counter becomes 16
> igb_ave_latency: the LL Counter becomes 2, the moderation counter becomes 56
> igb_bulk_latency: the LL Counter becomes 16, the moderation counter becomes 148
> 
> I really do not understand these settings. Maybe the spec is wrong? Or do
> you mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me), newitr);
>     }
> Or do you want to preserve the counters, set the CNT_INGR bit and mean
>     if (olditr != newitr) {
>         /* Change interrupt rate */
>         rxr->eitr_setting = newitr;
>         E1000_WRITE_REG(&adapter->hw, E1000_EITR(rxr->me),
>             0x80000000 | newitr);
>     }
> 
> Could you clarify that?
> > 
> > The 82576 document doesn't have a map of the register that I can find, so
> > I'm curious as to whether these observations are something I can assume are
> > true across all parts and motherboards/cards, or is there some
> > implementation variance that will cause these to only apply to the ones
> > I happen to be testing?
> > 
> > Thanks,
> > 
> > Barney

Ah, the register map in the older spec doesn't have the full or
correct information :\

Note that ripping out Intel's auto-moderation was one of my first tasks,
so I can't comment on how they derive those values (which are very
different than in Linux). It's supposed to be based on average packet
size, so I'm not sure what Jack is doing with some of these settings.
As for the EITR settings, the datasheet is just plain wrong. For example, in
section 7.3.3.1 it says that a setting of 125d would result in 8000
interrupts per second; in practice that value results in about 32K
interrupts per second. The 82575 seems to use the same algorithm
as the em class devices:

setting = 1,000,000,000 / (256 * ints_per_sec)

so "low-latency" is 30,517 interrupts per second, while "bulk latency"
is 3,255, which is way, way too high. So with 4 queues, you have a
minimum of 13K interrupts per second. It's just a concept that hasn't been
thought out or tested in practice. It's also absolutely ridiculous to
adjust the moderation on every interrupt. You can't interpret traffic
patterns in 1/8000th of a second. Also, notice that the bulk threshold
is 10,000 bytes, and the bulk setting is 3,255 ints/sec. Well, to
receive 10,000 bytes in 1/3255th of a second you need about 260Mb/s per
queue, so with 4 queues, assuming reasonable distribution, you'd have to be
receiving at more than wire speed to stay in bulk for more than 1
interrupt time. And how could the settings be the same for 1 queue as
for 4 queues?

I don't see any reference to what the LLI Moderation Enable bit might do. It
doesn't seem to do anything; setting it or not setting it seems to
result in the same level of moderation.

Barney




      

