Interrupt storm with MSI in combination with em1
Daan Vreeken
Daan at vehosting.nl
Fri May 6 15:02:49 UTC 2011
On Thursday 05 May 2011 22:22:15 Jack Vogel wrote:
> On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken <Daan at vehosting.nl> wrote:
> > Hi Peter,
> >
> > On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> > > On 2011-May-05 13:22:59 +0200, Daan Vreeken <Daan at vehosting.nl> wrote:
> > > >Not yet. I'll reboot the machine later today when I have physical
> > > > access to it to check the BIOS version. I'll keep you informed as
> > > > soon as I get another storm going.
> > >
> > > Depending on the quality of your BIOS (competence of the vendor), you
> > > might find that kenv(8) reports the BIOS version without needing a
> > > reboot.
> > > (Look at smbios.bios.* in the output).
...
> > smbios.bios.version="0303 "
...
> > Version "0402" is the latest and greatest, so it's time to upgrade. According
> > to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.
>
> Cool, thanks for the update! Good luck.
I've updated the BIOS and let the machine run for a couple of hours with
MSI/MSIX enabled. After 3 hours of uptime I see the storm again.
Here are the first couple of lines of output of "top -S" :
last pid: 33218;  load averages:  0.47,  0.35,  0.33   up 0+03:52:10  16:42:52
317 processes: 6 running, 289 sleeping, 22 waiting
CPU: 0.4% user, 0.0% nice, 0.5% system, 11.6% interrupt, 87.5% idle
Mem: 280M Active, 176M Inact, 1797M Wired, 8572K Cache, 32M Buf, 5545M Free
Swap: 500M Total, 500M Free
  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        4 171 ki31     0K    64K CPU0    0 893:17 351.95% idle
   12 root       23 -80    -     0K   368K WAIT    2  18:37  50.39% intr
One core is spending half its time handling interrupts.
/var/log/messages doesn't show any new message since the storm
started. "vmstat -i" now shows :
# vmstat -i
    interrupt                  total    rate
    irq3: uart1               917384      63
--> irq16: ehci0           809547235   55608
    irq23: ehci1             1751385     120
    cpu0:timer              16380717    1125
    irq256: em0:rx 0         1651907     113
    irq257: em0:tx 0         1495708     102
    irq258: em0:link               3       0
    irq259: em1:rx 0          397227      27
    irq260: em1:tx 0          257865      17
    irq261: em1:link               6       0
    irq262: re0                10549       0
    irq263: ahci0             290926      19
    cpu1:timer               1160008      79
    cpu3:timer                763939      52
    cpu2:timer               4120133     283
    irq272: hdac0             819282      56
    Total                  839564274   57670
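
For anyone who wants to spot the dominant source in output like this automatically, here is a minimal Python sketch. It is not part of the original report. The function names `parse_vmstat_i` and `storm_candidates` and the 50% threshold are assumptions made for illustration, and the parser only assumes the three-column "name, total, rate" layout shown above.

```python
import re

# One data line of `vmstat -i`: a source name (which may contain spaces,
# e.g. "irq256: em0:rx 0"), then the total count, then the rate.
# A leading "-->" marker, as used in the listing above, is ignored.
LINE_RE = re.compile(
    r"^(?:-->\s*)?(?P<name>\S.*?)\s+(?P<total>\d+)\s+(?P<rate>\d+)\s*$"
)


def parse_vmstat_i(text):
    """Return a list of (name, total, rate) tuples, skipping the header
    line, the command line and the final "Total" row."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None or match.group("name").startswith("Total"):
            continue
        rows.append(
            (match.group("name"), int(match.group("total")), int(match.group("rate")))
        )
    return rows


def storm_candidates(rows, share=0.5):
    """Return the names of sources that account for more than `share`
    of all interrupts counted so far (hypothetical threshold)."""
    grand_total = sum(total for _, total, _ in rows)
    if grand_total == 0:
        return []
    return [name for name, total, _ in rows if total / grand_total > share]


if __name__ == "__main__":
    import sys

    # Usage: vmstat -i | python this_script.py
    for name in storm_candidates(parse_vmstat_i(sys.stdin.read())):
        print("possible interrupt storm on", name)
```

Ranking by share of the total rather than by absolute rate is a design choice: it avoids having to guess what a "normal" rate is for a given machine.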
Apart from spending far too much time handling interrupts, the machine works
fine, so I'll let it run in case anyone wants me to try something on it.
As a next step to try to isolate the problem I could create a kernel with
MSI/MSIX enabled, but with a modified 'em' driver so it doesn't try to attach
the MSI/MSIX interrupts to see if the problem is really related to the
network cards or not.
If anyone has a better idea, I'm all ears :)
Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380