UFS Crash and directories now missing

Robert Bonomi bonomi at mail.r-bonomi.com
Sat Apr 28 15:39:18 UTC 2012


 Alejandro Imass <aimass at yabarana.com> wrote:
> On Sat, Apr 28, 2012 at 3:22 AM, Wojciech Puchar
> <wojtek at wojtek.tensor.gdynia.pl> wrote:
> >> I somewhat agree, but it wasn't a person. I am the only administrator,
> >> the only one with root access. The jails were effectively moved to the
> >> /usr/local/etc/apache22 of the single that survived at the top level.
> >> I'm thinking something between mount, EzJail, the journal and the way
> >> MySQL created a great deal of head contention, so something must have
> >> gotten corrupted at the directory level like you state, but the
> >> strange part is no _data_ corruption as such, because I was able to
> >> physically archive the jails, move them to the correct directory and
> >
> >
> > no matter what you do FreeBSD DOES NOT randomly move directories. if you are
> > sure you didn't move it yourself then it must be machine hardware problem
> > but still unlikely.
>
> After a little more research, ___it is NOT unlikely at all___ that
> under high distress and a hard boot, UFS could have somehow corrupted
> the directory structure, whilst maintaining the data intact.

This is technically accurate, *BUT* the specifics of the, quote, "corruption",
unquote, in the case under discussion make it *EXTREMELY* unlikely that this
is what happened.

99.99+++% of all UFS filesystem 'corruption' issues are the result of a
system crash _between_ the time cached 'meta-data' is updated in memory
and the time that data is flushed to disk (a deferred write).

The second most common (and vanishingly rare) failure mode is a power failure
_as_ a sector of disk is being written -- resulting in 'garbage data'
being written to disk.

The next possibility is 'cosmic rays'.  If running on 'cheap' hardware (i.e.,
without 'ECC' memory), this can cause a *SINGLE-BIT* error in data being
output.

The fact that the 'corrupted' filesystem passed fsck -without- any reported
errors shows that everything in the filesystem meta-data was consistent.

Given *that*, there are precisely *TWO* ways that the 'results' that have 
been reported could have happened.

  1) "Something" did a mv(1), i.e. a rename(2), of the various jail
     directories 'from' their original location to the 'apache' directory.
     This involves simply *copying* the directory entry from the jail's
     'parent directory' to the apache directory, and then marking the entry
     in the original parent as 'unused'.  Nothing other than the directory
     where the jail 'used to live', and the directory 'where it was found',
     is touched.  This occurred _through_ the system rename function, so
     all the normal 'housekeeping' was done properly.

  2) It was -not- done through rename(2) -- but that requires a whole
     *series* of "corruptions" of the filesystem, _ALL_ of which had to
     occur in 'exactly' the right way.  They are:
       1) The -size- (filesystem metadata) of the original parent directory
          had to be changed to reflect the smaller size.
       2) The 'indirect block' info for the original parent directory had to
          be changed to reflect the absence of the block(s) that are no
          longer part of that file.
       3) The _size_ of the apache directory had to be increased to reflect
          the additional block(s) that are now part of that directory.
       4) The 'indirect block' info for the apache directory had to be
          changed to reflect the presence of the new block(s) that are now
          part of that file.

    This requires multiple -hundreds- of bits 'in error', in a minimum of
    FOUR separate disk locations. A -single- failure simply *CANNOT* cause
    all of this.
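
For anyone who wants to see alternative #1 in action, here is a minimal
sketch in Python.  The directory names ('jails', 'apache22', 'jail1') are
made up for the illustration and are not the poster's actual paths.  It
assumes both parent directories live on the same filesystem, which is the
case where a move is just a rename(2).  It shows that the moved directory
keeps its inode number and its contents; only the two parent directories'
entries change.

```python
import os
import tempfile


def rename_keeps_inode():
    """Move a directory between two parents on one filesystem and
    report what changed (hypothetical names throughout)."""
    with tempfile.TemporaryDirectory() as root:
        old_parent = os.path.join(root, "jails")      # where the jail 'used to live'
        new_parent = os.path.join(root, "apache22")   # where it 'was found'
        os.mkdir(old_parent)
        os.mkdir(new_parent)

        jail = os.path.join(old_parent, "jail1")
        os.mkdir(jail)
        with open(os.path.join(jail, "data.txt"), "w") as f:
            f.write("payload")

        inode_before = os.stat(jail).st_ino

        # Same-filesystem move: only directory entries are rewritten.
        moved = os.path.join(new_parent, "jail1")
        os.rename(jail, moved)

        inode_after = os.stat(moved).st_ino
        with open(os.path.join(moved, "data.txt")) as f:
            payload = f.read()

        return {
            "same_inode": inode_before == inode_after,
            "old_listing": os.listdir(old_parent),
            "new_listing": os.listdir(new_parent),
            "payload": payload,
        }


if __name__ == "__main__":
    print(rename_keeps_inode())
```

The data blocks of the jail are never copied or rewritten, which is
consistent with directories that "moved" with no _data_ corruption.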

The probability of a random single-bit error in a gigabyte of RAM is on the
order of one such occurrence in six months.  The odds of having multiple
*simultaneous* errors is the probability of a single-bit error raised to
the power of the number of bits in error.  E.g., the probability of a
simultaneous 10-bit random error is roughly 1 in 30 million years.  The odds
of it being a -specific- ten bits out of that gigabyte are preposterously
small.  The odds of the required specific _multiple-hundreds_ of bits in
error occurring are (conservatively) 1 in
  (30 million years)**50 * ((2**30)!) / ((2**9)!)

The first factor, alone, is over 7.1E373 years.
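
The first factor can be checked with exact integer arithmetic.  The sketch
below is only a check of that one number: it takes the "30 million years
per 10-bit event" figure and the exponent of 50 from the expression above
as given and does not attempt the factorial terms.  The function and
variable names are illustrative.  Integers are used because a value this
large overflows an ordinary floating-point number.

```python
def first_factor(years_per_event=30_000_000, groups=50):
    """Return (30 million)**50 as an exact Python integer."""
    return years_per_event ** groups


value = first_factor()
digits = str(value)

# Scientific notation by hand: leading digits (truncated, not rounded)
# and the power of ten.
mantissa = f"{digits[0]}.{digits[1:3]}"
exponent = len(digits) - 1

print(f"{mantissa}E{exponent}")
```

The result comes out slightly above 7.1E373, in line with the figure quoted.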

I think it is safe to conclude that the probabilities -greatly- favor
alternative #1.


