Recurring problem: processes block accessing UFS file system

Ronald Klop ronald-freebsd8 at klop.yi.org
Tue Nov 22 03:02:11 GMT 2005


On Tue, 22 Nov 2005 00:54:09 +0100, Greg Rivers  
<gcr+freebsd-stable at tharned.org> wrote:

> I've recently put up three busy email relay hosts running 6.0-STABLE.  
> Performance is excellent except for a nagging critical issue that keeps  
> cropping up.
>
> /var/spool is its own file system mounted on a geom stripe of four BSD  
> partitions (details below).  Once every two or three days all the  
> processes accessing /var/spool block forever in disk wait.  All three  
> machines suffer this problem.  No diagnostic messages are generated and  
> the machines continue running fine otherwise, but a reboot is required  
> to clear the condition.  This problem occurs during normal operation,  
> but is particularly likely to occur during a backup when dump makes a  
> snapshot.
>
> There doesn't appear to be a problem with gstripe, as gstripe status is  
> "UP" and I can read the raw device just fine while processes continue to  
> block on the file system.  I tried running a kernel with WITNESS and  
> DIAGNOSTIC, but these options shed no light.
>
> If I catch the problem early enough I can break successfully into kdb;  
> otherwise, if too many processes stack up, the machine hangs going into  
> kdb and must be power-cycled.
>
> I'd appreciate any insight anyone may have into this problem or advice  
> on turning this report into a coherent PR.

I have a machine running 5.4-STABLE with the same problem. It hangs every  
couple of days if I make regular snapshots. It is a remote machine which I  
don't have easy access to. I disabled the snapshots and since then it  
hasn't hung a single time.
I had hoped it would be fixed in 6.0, but this sounds like the same problem.

Ronald.

-- 
  Ronald Klop
  Amsterdam, The Netherlands

