kern/118093: firewire bus reset hogs CPU, causing data to be lost

Dieter freebsd at sopwith.solgatos.com
Thu Dec 18 01:00:12 PST 2008


The following reply was made to PR kern/118093; it has been noted by GNATS.

From: Dieter <freebsd at sopwith.solgatos.com>
To: freebsd-firewire at freebsd.org, bug-followup at freebsd.org
Cc:  
Subject: Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost 
Date: Thu, 18 Dec 2008 00:51:41 +0000

 > > : Warner>  This can't be the case.  There's no SPL involved at all.  Maybe
 > > : Warner>  removing the printfs causes an interrupt to be serviced faster,
 > > : Warner>  resulting in what appears to be better performance...
 > > : 
 > > : With the printfs, Ethernet is not getting serviced.  This
 > > : is repeatable and easily reproduced.   Without the printfs,
 > > : it seems ok.
 > > : 
 > > : If it isn't spl, what is locking out Ethernet?
 > >
 > > interrupt storm?  In old-spl-locked drivers, often times the interrupt
 > > would be masked while certain operations happened.  In new
 > > mutex-locked freebsd, there's no way to block the interrupts, so
 > > sometimes old code gets into an interrupt storm, which starves other
 > > things.  Not sure why printf would trigger this, however...
 
 I tried logging in on the console and "cat /etc/termcap" does
 not create a problem.  Systat -vmstat reports that IRQ 5 (console)
 interrupts go from 0 to 120.  CPU:
 	0.0%Sys   0.6%Intr  0.0%User  0.0%Nice 99.4%Idle
 
 The normal firewire bus reset printfs generate 890 IRQ 5 interrupts
 with the shell firing off "fwcontrol -u 1 -r" in a loop.
 CPU goes from idle to 100% busy:
 	98.1%Sys   1.7%Intr  0.3%User  0.0%Nice  0.0%Idle
 
 "fwcontrol -u 0 -r" (the onboard firewire controller) gives
 the same 890 IRQ 5 interrupts, but CPU usage is different:
 	79.2%Sys  20.8%Intr  0.0%User  0.0%Nice  0.0%Idle
 
 A couple of other IRQs go up to 30-40 range, nothing huge.
 
 A single "fwcontrol -u 0 -r" generates 14 interrupts on IRQ 5
 and 4 interrupts on 3 other IRQs.
 
 A single "fwcontrol -u 1 -r" generates 14 interrupts on IRQ 5
 and 3 interrupts on 2 other IRQs.  yawn
 
 ------
 
 With the bus reset printfs commented out, I get:
 
 -u 0
 
    80.0%Sys   4.8%Intr 13.9%User  0.0%Nice  1.2%Idle
 
    3300 interrupts on the onboard firewire controller
 
 -u 1
 
    84.8%Sys   1.1%Intr 14.1%User  0.0%Nice  0.0%Idle
 
    3900 interrupts on the PCI card firewire controller
 
    After awhile the console says:
 
    fwohci1: phy read failed(1). i = 100 retry = 0
    fwohci1: phy read failed(1). i = 100 retry = 1
    fwohci1: phy read failed(1). i = 100 retry = 2
    fwohci1: phy read failed(1). i = 100 retry = 3
    ...
 
    and the system becomes unresponsive.  Reset button required.  :-(
 
 Observations:
 
 Kernel printfs to console appear to take more interrupts and
 more CPU than userland output to console.
 
 Seems odd that the onboard controller uses so much CPU for
 interrupts in the with-printf case, but there wasn't a
 large number on interrupts.  The without-printf case had
 a LOT more interrupts from the fw controller.
 
 Resetting the firewire bus in an endless loop is not recommended.
 
 I'm not an expert on interrupt storms, but none of this looks
 like an interrupt storm to me.  (Assuming systat is trustworthy.)
 
 Looks like the kernel printf takes a *lot* of CPU, and
 somehow locks out Ethernet.


More information about the freebsd-bugs mailing list