regression: msk0 watchdog timeout and interrupt storm
Boris Samorodov
bsam at passap.ru
Thu Oct 31 19:05:41 UTC 2013
On 31.10.2013 17:33, Boris Samorodov wrote:
> On 30.10.2013 06:16, Yonghyeon PYUN wrote:
>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>
>>> From time to time I use a notebook and boot FreeBSD from a USB
>>> stick. FreeBSD 9.2-i386 works OK. So I tried FreeBSD 10.0-i386
>>> BETA2, and the network adapter works for some 10-15 seconds and
>>> then stops with the diagnostic message "msk0:watchdog timeout".
>>> I've found a similar case on freebsd-current@ with no workaround.
>>> Yes, there is an interrupt storm as well.
>>
>> There have been no functional changes for a very long time, so I'm
>> not sure what's going on here. I've attached the local change I have
>> at the moment, but I'm afraid it won't address the issue above.
>>
>> I recall jhb also reported an interrupt storm in the past, but the
>> root cause has not been identified yet. Could you change msk_intr()
>> and let me know which interrupt is firing?
>
> I've yet to organize a build.
Success! The system is up and has been fetching and uploading for an
hour now. No more watchdog timeouts, storms, or freezes (stable/10,
i386, r257422M, modified by your patch).
A big THANK YOU!
>>> Here is some additional info:
>>> -----
>>> mskc0@pci0:3:0:0: class=0x020000 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
>>> vendor = 'Marvell Technology Group Ltd.'
>>> device = '88E8040T PCI-E Fast Ethernet Controller'
>>> class = network
>>> subclass = ethernet
>>> cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
>>> cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>> cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
>>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>> ecap 0003[130] = Serial 1 b8b063ffff681e00
>>> -----
>
> Meanwhile, some more investigation: "vmstat -i" in the calm and storm states:
> -----
> interrupt total rate
> irq1: atkbd0 1025 2
> irq9: acpi0 204 0
> irq14: ata0 327 0
> irq16: uhci0+ 246 0
> irq20: hpet0 22472 52
> irq23: uhci2 ehci1 10341 24
> irq256: hdac0 52 0
> irq257: mskc0 258 0
> irq258: ahci0 221 0
> Total 35146 81
> -----
> interrupt total rate
> irq1: atkbd0 1508 2
> irq9: acpi0 234 0
> irq14: ata0 409 0
> irq16: uhci0+ 246 0
> irq20: hpet0 72288 131
> irq23: uhci2 ehci1 10846 19
> irq256: hdac0 52 0
> irq257: mskc0 4419760 8021
> irq258: ahci0 221 0
> Total 4505564 8177
> -----
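
A quick way to see which source is storming is to subtract the calm
"vmstat -i" totals from the storm totals. The following is only a
minimal sketch, not part of the original report: the helper names
parse_vmstat_i and largest_growth are hypothetical, and the sample
text is copied from the two listings above.

```python
import re

# Rows copied from the two "vmstat -i" listings quoted above.
CALM = """\
irq1: atkbd0 1025 2
irq9: acpi0 204 0
irq14: ata0 327 0
irq16: uhci0+ 246 0
irq20: hpet0 22472 52
irq23: uhci2 ehci1 10341 24
irq256: hdac0 52 0
irq257: mskc0 258 0
irq258: ahci0 221 0
Total 35146 81
"""

STORM = """\
irq1: atkbd0 1508 2
irq9: acpi0 234 0
irq14: ata0 409 0
irq16: uhci0+ 246 0
irq20: hpet0 72288 131
irq23: uhci2 ehci1 10846 19
irq256: hdac0 52 0
irq257: mskc0 4419760 8021
irq258: ahci0 221 0
Total 4505564 8177
"""

# "irqN: name(s) <total> <rate>"; the name part may contain spaces.
ROW = re.compile(r"^(irq\d+: .+?)\s+(\d+)\s+(\d+)$")


def parse_vmstat_i(text):
    """Map each interrupt source to its total count (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # skips the header and the "Total" row
            counts[match.group(1)] = int(match.group(2))
    return counts


def largest_growth(before, after):
    """Return (source, delta) for the source whose total grew the most."""
    deltas = {name: total - before.get(name, 0) for name, total in after.items()}
    name = max(deltas, key=deltas.get)
    return name, deltas[name]


if __name__ == "__main__":
    source, delta = largest_growth(parse_vmstat_i(CALM), parse_vmstat_i(STORM))
    print(source, delta)
```

On the data above this singles out irq257 (mskc0), whose total grows
by several million between the two snapshots while every other source
grows by tens of thousands at most.
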
>
> And "vmstat -w1" for calm and storm:
> -----
> procs     memory          page               disks      faults         cpu
> r b w    avm    fre flt re pi po  fr sr mm0 ad0     in  sy     cs us sy  id
> 0 0 0 206928 956040 277  0  2  0 330  4   0   0    117 476    454  0  1  99
> 0 0 0 206928 956036   0  0  0  0   8  4   0   0     50 123    137  0  0 100
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     47 120     92  0  1  99
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     43 123    119  0  1  99
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     55 132    123  0  1  99
> 0 0 0 206928 956004   0  0  0  0   0  4   0   0     68 123    185  0  1  99
> 0 0 0 206928 956036   0  0  0  0   8  4   0   0     86 123    266  0  1  99
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     44 125    124  0  0 100
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     64 128    164  0  1  99
> 0 0 0 206928 956036   0  0  0  0   0  4   0   0     42 131    101  0  1  99
> -----
> procs     memory          page               disks      faults         cpu
> r b w    avm    fre flt re pi po  fr sr mm0 ad0     in  sy     cs us sy  id
> 0 0 0 213648 954676 104  0  1  0 121  4   0   0  22299 204  44262  0 10  90
> 0 0 0 213648 954672   0  0  0  0   8  4   0   0 112259 123 222379  0 44  56
> 0 0 0 213648 954672   0  0  0  0   0  4   0   0 111792 123 221489  0 43  57
> 0 0 0 213648 954672   1  0  0  0   0  4   0   0 109887 183 217754  0 43  57
> 0 0 0 213648 954668   2  0  0  0   0  4   0   0 109543 146 216963  0 44  56
> 0 0 0 213648 954668   0  0  0  0   0  4   0   0 110142 123 218187  0 45  55
> 0 0 0 213648 954660 472  0  0  0 474  4   0   0 109340 717 216674  0 42  57
> 0 0 0 213648 954656   2  0  0  0   0  4   0   0 109459 147 216831  0 43  57
> 0 0 0 213648 954656   0  0  0  0   0  4   0   0 109462 131 216827  0 43  57
> 0 0 0 213648 954656   0  0  0  0   0  4   0   0 109454 123 216803  0 42  58
> -----
>
> Dmesg is here: ftp://ftp.wart.ru/pub/misc/tos.dmesg.boot.txt .
>
> BTW, some more observations. While downloading a file, the system
> hits the watchdog timeout rather quickly but keeps running. If I
> upload files instead, it works much longer (a couple of minutes)
> but then freezes. Ctrl-Alt-Esc has no effect; only a cold restart
> works.
>
> Thanks!
>
--
WBR, Boris Samorodov (bsam)
FreeBSD Committer, http://www.FreeBSD.org The Power To Serve