Blocked process

Daniel O'Connor doconnor at gsoft.com.au
Thu Aug 20 02:34:32 UTC 2009


Hi,
We have several systems doing data acquisition. I had originally 
thought the interrupt handler for our PCI card was not being called 
quickly enough, but I had misread the diagnostics :)

The digitised data is fed into a FIFO, and when the FIFO is part full 
(32 kbytes) an interrupt is generated. The IRQ routine reads 32 kbyte 
chunks into a 4 Mbyte kernel buffer until the part-full condition 
clears. If the FIFO full flag is seen (it is latched by the hardware) 
then acquisition is halted.
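
The data path described above can be modelled roughly as follows. This is only a toy sketch in Python, not the driver code: the class and method names (KernelBuffer, irq_push, user_read) are made up for illustration, and the real IRQ routine is of course kernel code. It only shows how the 4 Mbyte buffer fills in 32 kbyte chunks when the userland reader does not drain it.

```python
CHUNK = 32 * 1024            # bytes moved per part-full interrupt
BUF_SIZE = 4 * 1024 * 1024   # kernel buffer size from the description


class KernelBuffer:
    """Toy model of the kernel buffer (hypothetical, for illustration)."""

    def __init__(self, size=BUF_SIZE):
        self.size = size
        self.used = 0
        self.overflowed = False

    def irq_push(self, nchunks):
        """Model the IRQ routine copying chunks from the FIFO."""
        for _ in range(nchunks):
            if self.used + CHUNK > self.size:
                # No room left: this is the "kernel buffer full" case.
                self.overflowed = True
                return
            self.used += CHUNK

    def user_read(self, nbytes):
        """Model the userland process draining the buffer."""
        taken = min(nbytes, self.used)
        self.used -= taken
        return taken
```

With these numbers the buffer holds exactly 128 chunks, so 128 interrupts with no intervening read fill it and the next one overflows.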

The problem now appears to be that the userland process that reads data 
out of the kernel is being stalled for over 4 seconds. This process 
reads from the kernel, does some minor processing, and then writes the 
data out to a child process that does some more work on it.

To see what ELSE was using the CPU while the process was stalled, I ran 
'ps -xaulwww' in a loop every second, and found that the script itself 
stalled for 7 seconds.
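
One way to measure such stalls directly, rather than inferring them from gaps in ps output, is to record timestamps in a loop and flag any gap much longer than the sleep interval. The following is a minimal sketch in Python; the function names, the 1 second interval, and the 0.5 second tolerance are assumptions for illustration, not something from the system described here.

```python
import time


def find_stalls(timestamps, interval=1.0, tolerance=0.5):
    """Return (start, end, gap) for each pair of consecutive samples
    spaced further apart than interval + tolerance seconds."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, cur, gap))
    return stalls


def sample(count, interval=1.0):
    """Record a monotonic timestamp roughly every `interval` seconds."""
    stamps = []
    for _ in range(count):
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps
```

Calling find_stalls(sample(60)) would watch for a minute. Note that a watcher like this is subject to the same scheduling as everything else, so it reports how long the watcher itself was delayed, which is what the ps loop showed as well.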

I tried increasing the buffer inside the kernel (to 8 Mbyte), which 
seemed to have no effect. However, renice'ing the process from -5 to 
-20 has greatly reduced the frequency of occurrence. WRT the buffer 
size: I would expect a larger increase to reduce the problem, but since 
I have only increased it to ~4 seconds' worth and the stall is longer 
than that, I see no effect.
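
As a rough worked example of the buffer arithmetic: if 8 Mbyte really is about 4 seconds' worth of data, the implied rate is about 2 Mbyte/s, and riding out a 7 second stall would need about 14 Mbyte. The data rate here is inferred from those two figures and is an assumption, as are the helper names in this Python sketch.

```python
MIB = 1024 * 1024


def data_rate(buffer_bytes, seconds_of_data):
    """Bytes per second implied by a buffer that holds N seconds of data."""
    return buffer_bytes / seconds_of_data


def buffer_needed(rate_bytes_per_s, stall_seconds):
    """Smallest buffer (in bytes) that absorbs a stall of the given length."""
    return rate_bytes_per_s * stall_seconds


# Assumption: 8 MiB is about 4 seconds of data, as stated above.
rate = data_rate(8 * MIB, 4.0)
needed = buffer_needed(rate, 7.0)
print(f"rate = {rate / MIB:.1f} MiB/s, needed for 7 s = {needed / MIB:.1f} MiB")
```

This only sizes the buffer for the longest stall seen so far; it says nothing about why the stall happens.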

Given that renice'ing has an effect, it seems to be a scheduler 
problem. I don't see how it can be the motherboard stalling the whole 
system, because then the FIFO full error would occur, whereas I only 
see the 4 Mbyte kernel buffer filling up.

One other possibility would be something holding a lock for too long 
that blocks both the DAQ readout process and ps, but I am not sure how 
I would find out what is holding it.

Unfortunately the system is in Finland and I'm in Australia so I can't 
sit at the console :(

I am hoping to be able to replicate the HW & SW locally at some stage 
but haven't been able to yet.

Any help appreciated, thanks!

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C


More information about the freebsd-stable mailing list