Crazy "interrupt storm detected" on atapci0

Nicolai ns at got2get.net
Tue Mar 17 07:07:58 PDT 2009



Hi all, 

I have had this problem since day 1 on my new server. 

It has run since November 15th 2008 and serves approx. 10 GB worth of web
traffic per month for the main site, plus some 40 domains with mail and
small web pages (hence, it's NOT that busy yet).

I started with 7.1-RELEASE-pX and didn't have problems straight off,
but that didn't last long.

After a few days of running, the interrupt storm on atapci0 starts to
show. It slowly builds up and keeps going. When it reaches 150-200k/sec,
I reboot just to be on the safe side.

I have also upgraded to 7.1-STABLE to get all the ATA driver changes
S.O.S. has been including. Still no visible change.

To give you an impression of its impact, I will let the numbers speak for
themselves:

$ uname -v
FreeBSD 7.1-STABLE #1: Thu Mar 12 14:22:49 CET 2009 

$ uname -m
amd64 

$ uptime
 2:36PM up 4 days, 22:12, 5 users, load averages: 0.28, 0.40, 0.19 

$ tail -10 messages
Mar 17 13:42:37 box last message repeated 600 times
Mar 17 13:52:37 box last message repeated 600 times
Mar 17 14:02:37 box last message repeated 600 times
Mar 17 14:12:37 box last message repeated 600 times
Mar 17 14:22:37 box last message repeated 600 times
Mar 17 14:32:22 box last message repeated 585 times
Mar 17 14:32:23 box kernel: pid 78195 (try), uid 0: exited on signal 10 (core dumped)
Mar 17 14:32:23 box kernel: interrupt storm detected on "irq22:"; throttling interrupt source
Mar 17 14:32:54 box last message repeated 31 times
Mar 17 14:34:55 box last message repeated 121 times 

$ vmstat -i
interrupt                          total       rate
irq1: atkbd0                           3          0
irq9: acpi0                            1          0
irq16: ohci0                           1          0
irq17: ohci1 ohci3                     1          0
irq18: ohci2 ohci4                     1          0
irq22: atapci0               57317362717     134713
cpu0: timer                    850996016       2000
cpu1: timer                    850995703       2000
Total                        59019354443     138713
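
For anyone wanting to pick the dominant source out of output like the above without reading it by eye, here is a minimal sketch. The helper name top_irq is hypothetical (it is not a FreeBSD tool), and it assumes the last two whitespace-separated fields of each line are the total and the rate, as in the listing above.

```sh
# Hypothetical helper: read `vmstat -i` output on stdin and print the
# source with the highest value in the rate column.
top_irq() {
    awk '
        BEGIN { max = -1 }
        # Skip the header line and the Total summary line.
        $1 == "interrupt" || $1 == "Total" { next }
        # Assumption: the last field is the rate, the one before it the total.
        NF >= 3 && $NF ~ /^[0-9]+$/ && $NF + 0 > max {
            max = $NF + 0
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
        }
        END { if (name != "") printf "%s %d\n", name, max }
    '
}
```

It would be used as `vmstat -i | top_irq`.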

[root@box /etc]# atacontrol mode ad4
current mode = SATA300
[root@box /etc]# atacontrol mode ad6
current mode = SATA300

Some relevant lines from dmesg: 

atapci0: port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 18.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI Version 01.10 controller with 4 ports detected
ata2:  on atapci0
ata2: [ITHREAD]
ata3:  on atapci0
ata3: [ITHREAD]
ata4:  on atapci0
ata4: [ITHREAD]
ata5:  on atapci0
ata5: [ITHREAD] 

And a few lines from pciconf: 

atapci0@pci0:0:18:0: class=0x01018f card=0x73271462 chip=0x43801002 rev=0x00 hdr=0x00
 vendor = 'ATI Technologies Inc'
 device = 'IXP SB600 Serial ATA Controller'
 class = mass storage
 subclass = ATA 

...so this is where I'm at. The interrupt storm rises through the roof in
just 3 days, and continues to rise.
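
A note on measuring that growth: the rate column of vmstat -i appears to be an average since boot (the timer lines are consistent with this, since 850996016 / 2000 is roughly 4 days 22 hours), so it lags behind a storm that is still building. A rough sketch for getting the current rate from two samples of the total counter follows. The helper names irq_total and irq_rate are hypothetical, and the 10-second interval is an arbitrary choice.

```sh
# Hypothetical helper: print the current total for one interrupt source.
# $1 is the label in the first column of `vmstat -i`, e.g. "irq22:".
# Assumption: the total is the second-to-last field on the line.
irq_total() {
    vmstat -i | awk -v src="$1" '$1 == src { print $(NF - 1) }'
}

# Hypothetical helper: average interrupts per second between two samples.
# $1 = first total, $2 = second total, $3 = seconds between the samples.
irq_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Example use:
#   a=$(irq_total irq22:); sleep 10; b=$(irq_total irq22:)
#   irq_rate "$a" "$b" 10
```

Logging that figure periodically (for example from cron) would show how quickly the storm ramps up between reboots.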

Just for kicks I tried disabling AHCI with nextboot, but that made the box
not boot. Also, I'm 1000 km away from the box, so I'm a little limited
in testing fancy boot options, apart from things that can go in
nextboot.conf.

If anyone has any hints on how to proceed, I would be grateful.


Thank you in advance 

 - Nicolai


More information about the freebsd-stable mailing list