Re: cvs commit: src/sys/ufs/ffs ffs_alloc.c ffs_softdep.c

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Fri, 23 Feb 2007 16:31:30 -0500
On Fri, Feb 23, 2007 at 01:16:54PM -0800, Brian Somers wrote:
> On Fri, 23 Feb 2007 15:41:12 -0500
> Kris Kennaway <kris_at_obsecurity.org> wrote:
> 
> > On Fri, Feb 23, 2007 at 08:23:36PM +0000, Brian Somers wrote:
> > > brian       2007-02-23 20:23:36 UTC
> > > 
> > >   FreeBSD src repository
> > > 
> > >   Modified files:
> > >     sys/ufs/ffs          ffs_alloc.c ffs_softdep.c 
> > >   Log:
> > >   Account for di_blocks allocations when IN_SPACECOUNTED is set in an
> > >   inode's i_flag.
> > >   
> > >   It's possible that after ufs_inactive() calls softdep_releasefile(),
> > >   i_nlink stays >0 for a considerable amount of time (> 60 seconds here).
> > >   During this period, any ffs allocation routines that alter di_blocks
> > >   must also account for the blocks in the filesystem's fs_pendingblocks
> > >   value.
> > >   
> > >   This change fixes an eventual df/du discrepancy that will happen as
> > >   the result of fs_pendingblocks being reduced to <0.
> > >   
> > >   The only manifestation of this that people may recognise is the
> > >   following message on boot:
> > >   
> > >       /somefs: update error: blocks -N files M
> > >   
> > >   at which point the negative pending block count is adjusted to zero.
> > 
> > \o/ I hate that bug!
> 
> As do I!  As a result of the bug, all Sophos mail appliance
> customers had to suffer bi-weekly reboots for the past year
> (well, it was hardly the fault of this bug initially!).
> 
> It took weeks to fix -- I have never been able to reproduce
> the problem on demand and had to resort to inserting copious
> amounts of diagnostics on several machines then sitting around
> until ffsinfo said one of the machines had "bitten".
> 
> Until recently, even that strategy didn't work as our test
> machines just wouldn't see the problem.  We recently released
> a less powerful version of the appliance, and only then did
> we see the problem reasonably frequently (between 4 and 24
> hours usually).

Wow, nice job :)

Kris
Received on Fri Feb 23 2007 - 21:31:31 UTC
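
For readers following the thread, the accounting rule described in the commit log can be sketched as a small user-space model. This is not the committed kernel code: the toy_* structures and functions are hypothetical stand-ins, the flag value is invented for illustration, and the model assumes that the final free subtracts whatever di_blocks holds at that moment from fs_pendingblocks. Only the names di_blocks, fs_pendingblocks, IN_SPACECOUNTED and softdep_releasefile() come from the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
#define IN_SPACECOUNTED 0x0080      /* flag value invented for this sketch */

struct toy_fs {
    int64_t fs_pendingblocks;       /* blocks counted as about to be freed */
};

struct toy_inode {
    int            i_flag;
    int64_t        di_blocks;       /* blocks currently held by the inode */
    struct toy_fs *i_fs;
};

/* Hypothetical model of softdep_releasefile(): count the inode's blocks
 * as pending and remember that this has been done. */
static void
toy_releasefile(struct toy_inode *ip)
{
    assert(ip != NULL && ip->i_fs != NULL);
    ip->i_fs->fs_pendingblocks += ip->di_blocks;
    ip->i_flag |= IN_SPACECOUNTED;
}

/* The rule from the commit log: any change to di_blocks made while
 * IN_SPACECOUNTED is set must be mirrored in fs_pendingblocks. */
static void
toy_adjust_blocks(struct toy_inode *ip, int64_t delta)
{
    assert(ip != NULL && ip->i_fs != NULL);
    ip->di_blocks += delta;
    if (ip->i_flag & IN_SPACECOUNTED)
        ip->i_fs->fs_pendingblocks += delta;
}

/* Assumed model of the final free: every block the inode holds leaves
 * the pending count. */
static void
toy_free_all(struct toy_inode *ip)
{
    assert(ip != NULL && ip->i_fs != NULL);
    ip->i_fs->fs_pendingblocks -= ip->di_blocks;
    ip->di_blocks = 0;
    ip->i_flag &= ~IN_SPACECOUNTED;
}

/* Release a file holding 8 blocks, allocate 4 more while it lingers,
 * then free it. Returns the final fs_pendingblocks.
 * account != 0 applies the rule; account == 0 models the old behaviour. */
int64_t
toy_scenario(int account)
{
    struct toy_fs fs = { 0 };
    struct toy_inode ip = { 0, 8, &fs };

    toy_releasefile(&ip);
    if (account)
        toy_adjust_blocks(&ip, 4);
    else
        ip.di_blocks += 4;          /* allocation that skips the accounting */
    toy_free_all(&ip);
    return fs.fs_pendingblocks;
}
```

In this model, toy_scenario(0) leaves fs_pendingblocks at -4, which is the kind of negative count the "update error: blocks -N" message reports, while toy_scenario(1) leaves it at 0.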