Debugging server hangs in 7.2-RELEASE

Marc G. Fournier scrappy at hub.org
Sun May 10 23:46:25 UTC 2009


I am completely out of ideas on how to debug this; maybe someone else has 
some?

The problem appears to be that very suddenly, the disk busy (according to 
vmstat) skyrockets to >100 (from 0) and then the 'runnable but swapped' 
column slowly rises ...

One person suggested that they saw similar behaviour when MSI/MSI-X was 
enabled ... after searching the source code, I found that MSI was used in 
the bge driver, but I couldn't find MSI-X used anywhere else on that 
machine, so I disabled MSI ... it's still exhibiting the issue ...

I get no errors on the serial console to indicate any problems, and until 
a relatively recent upgrade of the kernel (I can't give an exact date), 
this server was one of my most solid ...

I figure there is a single process that is starting up on the machine that 
is causing this, but no matter what I try, it is eluding me.
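
One way to try to catch such a process would be a small logger that 
snapshots vmstat and the process list every few seconds and syncs each 
entry to disk, so that the last entries before a hang survive. A minimal 
sketch in Python, under assumptions: the log path, the interval, and the 
helper names (run, log_snapshot, watch) are made up for illustration, and 
the choice of ps columns is only a guess at what would be useful to record:

```python
import os
import subprocess
import time

# Hypothetical defaults; adjust for the machine being watched.
LOG_PATH = "/var/log/hang-watch.log"
INTERVAL = 5  # seconds between snapshots
COMMANDS = [
    ["vmstat"],
    ["ps", "-axo", "pid,state,pcpu,pmem,command"],
]


def run(cmd):
    """Run one command and return its output, or a note if it failed."""
    try:
        res = subprocess.run(cmd, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT, timeout=10)
        return res.stdout.decode("utf-8", errors="replace")
    except (OSError, subprocess.SubprocessError) as exc:
        return "failed: %r\n" % (exc,)


def log_snapshot(path, commands):
    """Append one timestamped snapshot and force it out to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a", encoding="utf-8") as fh:
        fh.write("==== %s ====\n" % stamp)
        for cmd in commands:
            fh.write("$ %s\n%s\n" % (" ".join(cmd), run(cmd)))
        fh.flush()
        os.fsync(fh.fileno())  # so the last entry survives a hard hang


def watch(path, commands, interval, count=None):
    """Take snapshots forever, or `count` times if a count is given."""
    taken = 0
    while count is None or taken < count:
        log_snapshot(path, commands)
        taken += 1
        time.sleep(interval)
```

Calling watch(LOG_PATH, COMMANDS, INTERVAL) from a root shell and reading 
the tail of the log after the next hang might show what started just 
before the disk went busy. If the disk itself stalls, the fsync will 
block too, so the final entry may be missing.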

I have KDB enabled in the kernel, and the serial console set up so that I 
can break to it ... but when this problem happens, doing 'cr ~ ^b' through 
the serial console doesn't do anything, or, it just prints the message 
about breaking to the debugger and then hangs there ...

My next option is to start time travelling backwards to see if I can find 
a 'stable kernel' again, but if it is just one process causing this, then 
going back to older kernels isn't necessarily going to accomplish anything 
...

Is there something else I can do here to debug this?  It's hard to believe 
that, in such an advanced OS, debugging issues like this is so elusive :(



----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy at hub.org                              MSN . scrappy at hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664


More information about the freebsd-stable mailing list