Persisting troubles with periodic stalls every few minutes

Bartosz Fabianowski freebsd at chillt.de
Wed Jan 19 17:36:19 PST 2005


Hi list,

I have been having a lot of trouble with performance on my laptop, which
I first set up with 5.3-RELEASE and constantly keep up to date with
5.3-STABLE. The box is stable, but periodically, anywhere from every few
minutes to every few seconds, it stalls for 5 seconds. By that I mean
that the screen is not refreshed and all keystrokes go into some kind of
buffer, to be processed when the stall is over. Some keystrokes also get
lost or reversed in order, which makes this even more annoying.

I know that during the stalls, CPU usage goes up to 100%, so it is some 
process that periodically wakes up and hogs the CPU for a few seconds. 
Also, during the stalls, there is always a lot of disk activity. The 
only "special" thing about this machine is that /usr/home is GBDE 
encrypted. But even when I am not doing anything on that partition, 
stalls occur.

The rate of stalls varies depending on what I am doing. Even when the
box is idle, it does stall periodically. But when I am making world, the
box becomes almost completely unusable, as disk activity on /usr (which
is not GBDE encrypted) triggers the same symptoms.
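
One way to put numbers on the stall rate, as a rough sketch only: a
small Python loop that sleeps in short steps and logs every wake-up that
arrives far too late. The function names, the log file name and the
thresholds are all made up for illustration. It assumes the watcher
itself is frozen along with everything else, so that the stall shows up
as a gap between two wake-ups.

```python
import time


def find_gaps(timestamps, interval, threshold):
    """Return (start, duration) for every pair of consecutive timestamps
    that are more than interval + threshold apart."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        lateness = (cur - prev) - interval
        if lateness > threshold:
            gaps.append((prev, cur - prev))
    return gaps


def watch(interval=0.1, threshold=1.0, logfile="stalls.log"):
    """Hypothetical watcher: runs until interrupted and appends one line
    per detected stall to logfile."""
    prev_wall = time.time()        # wall clock, only used for the log line
    prev_mono = time.monotonic()   # monotonic clock, used to measure the gap
    with open(logfile, "a") as log:
        while True:
            time.sleep(interval)
            now_wall = time.time()
            now_mono = time.monotonic()
            for _, duration in find_gaps([prev_mono, now_mono],
                                         interval, threshold):
                log.write("%s stalled for %.1f s\n"
                          % (time.ctime(prev_wall), duration))
                log.flush()
            prev_wall, prev_mono = now_wall, now_mono
```

Calling watch() in a spare terminal and leaving it running should give
a list of stall times that can be compared against whatever else was
going on (idle, making world, and so on).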

The trouble is that I cannot figure out how to find the responsible
process. Tools such as top(1) update at one-second intervals at best,
and as there are no screen updates during the stall, they produce
nothing useful. The only tool that gave me some kind of information was
systat(1). When I invoke "systat -vmstat 1", I see the following:

When everything is working normally, the CPU is at:
15% system    30% user    65% idle

At the first screen update after a stall, the CPU is at:
15% system    85% user     0% idle

Also, the VM statistics such as "zfod" and "ofod" have jumped from their
usual zero level to several thousand. And disk activity is reported as
high, of course. A couple of seconds after the stall is over, all
statistics return to their normal values.

So, some user process is misbehaving, and has been doing so ever since
this box was set up. Plus, it is somehow disk related and happens no
matter what I am running or doing.

Any ideas on how to debug this? How can I find the guilty process?
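
A minimal sketch of one possible approach, not a tested recipe: since
nothing reaches the screen during a stall, snapshot the accumulated CPU
time of every process to a file once per interval and look afterwards at
which process gained the most across the stall. This assumes Python is
installed and that "ps -axo pid,time,comm" prints a header line followed
by pid, accumulated CPU time and command name. All function names and
the log file name below are made up for illustration. A process that
starts and exits between two snapshots would be missed.

```python
import subprocess
import time


def parse_cputime(text):
    """Convert a ps(1) TIME field such as '1:02.50' or '1-02:03:04'
    (optional days, then colon-separated fields) to seconds."""
    days, _, clock = text.rpartition("-")
    seconds = 0.0
    for part in clock.replace(",", ".").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds + (int(days) * 86400 if days else 0)


def parse_ps(output):
    """Map pid -> (cpu_seconds, command) from the ps output."""
    table = {}
    for line in output.splitlines()[1:]:       # skip the header line
        fields = line.split(None, 2)
        if len(fields) == 3:
            pid, cputime, command = fields
            table[int(pid)] = (parse_cputime(cputime), command)
    return table


def biggest_consumers(before, after, count=5):
    """Rank processes by CPU seconds gained between two snapshots."""
    gains = []
    for pid, (cpu, command) in after.items():
        old_cpu = before.get(pid, (0.0, command))[0]
        gains.append((cpu - old_cpu, pid, command))
    gains.sort(reverse=True)
    return gains[:count]


def snapshot():
    """Take one snapshot of all processes via ps(1)."""
    out = subprocess.check_output(["ps", "-axo", "pid,time,comm"], text=True)
    return parse_ps(out)


def log_loop(interval=1.0, logfile="cpu-gains.log"):
    """Hypothetical logger: runs until interrupted, appending the top
    CPU gainers of each interval to logfile."""
    before = snapshot()
    with open(logfile, "a") as log:
        while True:
            time.sleep(interval)
            after = snapshot()
            for gain, pid, command in biggest_consumers(before, after, 3):
                log.write("%s pid=%d %s +%.2fs\n"
                          % (time.ctime(), pid, command, gain))
            log.flush()
            before = after
```

With log_loop() running in the background, the entries written right
after a stall should show which process picked up several seconds of CPU
time in a single interval.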

Thanks for any and all input,
- Bartosz Fabianowski

PS: According to sysctl, DMA is enabled on both ata and atapi, so that is
not the issue.

