UFS Crash and directories now missing

Alejandro Imass aimass at yabarana.com
Sun Apr 29 01:58:19 UTC 2012

>> >> I somewhat agree, but it wasn't a person. I am the only administrator,
>> >> the only one with root access. The jails were effectively moved to the
>> >> /usr/local/etc/apache22 of the single jail that survived at the top level.
>> >> I'm thinking something between mount, EzJail, the journal and the way
>> >> MySQL created a great deal of head contention, so something must have
>> >> gotten corrupted at the directory level like you state, but the
>> >> strange part is no _data_ corruption as such, because I was able to
>> >> physically archive the jails, move them to the correct directory and
>> > no matter what you do FreeBSD DOES NOT randomly move directories. If you are
>> > sure you didn't move it yourself then it must be a machine hardware problem,
>> > but that is still unlikely.
>> After a little more research, ___it is NOT unlikely at all___ that
>> under high distress and a hard boot, UFS could have somehow corrupted
>> the directory structure, whilst maintaining the data intact. From what
>> I've learned so far, UFS is actually divided into 2 layers: one that
>> controls the directory structure and metadata and a lower layer
>> containing the data, so the directories being screwed up and the data
>> intact is actually quite possible.
>> What I'm trying to do is figure out how it happened, and try to
>> prevent it from happening again, so instead of dismissing it as an
>> impossibility, I think we all should spend a little time figuring out
>> how these things can happen and determine how it can be prevented or
>> reduced.
> Somebody mentioned the links. Did you use links in the jails to access the data? If so, and the directories of the jails got screwed up, the links are gone but the original data is still there. The damaged directory might have been fixed during the first reboot after the crash without you ever noticing the fix.

Hi Erich, thanks for your reply.

I don't know what links you are referring to, but please point me in
that direction. I initially suspected that it could have been the
journal recovery and/or fsck, but as you can see, a couple of people
have said this is impossible. I have to admit my ignorance of some
specifics of the UFS filesystem, yet logically it still seems like the
most plausible explanation.
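
To illustrate the general idea that directory entries and file data are
tracked separately, here is a minimal sketch. It is only an illustration
of ordinary POSIX rename behaviour, not of UFS internals, and it does not
reproduce the crash. The paths and the function name move_keeps_inode are
made up for the example.

```python
import os
import tempfile


def move_keeps_inode(root):
    """Re-parent a directory and report whether a file inside it keeps
    its inode number and its contents.

    All paths below are hypothetical; they only mimic the shape of a jail
    directory ending up under another jail's tree.
    """
    src = os.path.join(root, "jails", "www")
    os.makedirs(src)
    data_file = os.path.join(src, "data.txt")
    with open(data_file, "w") as fh:
        fh.write("payload")
    inode_before = os.stat(data_file).st_ino

    # Moving the directory within one filesystem only rewrites directory
    # entries; the file's inode and data blocks are left alone.
    new_parent = os.path.join(root, "proxy", "usr", "local", "etc")
    os.makedirs(new_parent)
    dst = os.path.join(new_parent, "www")
    os.rename(src, dst)

    moved_file = os.path.join(dst, "data.txt")
    inode_after = os.stat(moved_file).st_ino
    with open(moved_file) as fh:
        content = fh.read()
    return inode_before == inode_after, content


with tempfile.TemporaryDirectory() as root:
    same_inode, content = move_keeps_inode(root)
    print("same inode:", same_inode, "content:", content)
```

If a directory entry ends up attached to the wrong parent, for whatever
reason, everything below it would look "moved" while the file contents
stay untouched, which at least matches what I saw.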

I've been running FBSD since 6.2 and jails since then as well.  Today
I run 6 public servers on 8.2 with between 15 and 20 jails each. We
switched to ezjail last year and use it strictly by the book. I do use
flavours though, and I may archive and re-create jails from a specific
archive, but always using ezjail-admin. Since all our servers are 8.2
and all updated the same, I may port jails from one server to the
other using the ezjail archive method, but nothing as stupid as
using cp or soft links, as someone was suggesting.

I've never had any problems except on _this particular server_, where I
have a client that has a problem with MySQL which under some conditions
drains the whole server. I suspected corruption of the fs because of
all the contention generated by MySQL, to the point where the server
simply hung and I had to hard-reboot. I doubt it's hardware because these
are relatively new servers: Xeon X3370, 8GB RAM, 2 x 150GB 10,000rpm
Velociraptor disks. We have the pristine OS on one disk and the jails on
the other. Nothing runs outside of jails, not even the MTA, which is
postfix running inside one of the jails.

This is the first crash where anything like this has happened in over 6
years running FBSD, and I am as surprised as anyone here because of the
weirdness of the jail directories moving like that. We had backups of
the previous night, but I didn't even use them. The data was all
there, intact, just moved inside the only surviving jail, which
happens to be the http reverse proxy of all the other jails.

If you have any leads as to how this can happen other than cosmic rays
I would greatly appreciate it.



