UFS Crash and directories now missing

Alejandro Imass ait at p2ee.org
Mon Apr 30 18:39:24 UTC 2012

On Mon, Apr 30, 2012 at 1:57 PM, jb <jb.1234abcd at gmail.com> wrote:
> Alejandro Imass <ait <at> p2ee.org> writes:
>> If you have really followed the thread, all I have done is try to find
>> some explanation for a strange behavior of the system under normal
>> use. It hung, and some directories were moved, period. I have posted
>> some ideas to share with other people expecting some insight and maybe
>> similar experience from other users, which there probably are many,
>> but many times afraid to speak up and avoid getting insulted.
>> ...
> Hierarchical Jails
> You said you have your jail env on a separate disk.


> I looked at problem reports for nullfs and there are quite few.
> http://www.freebsd.org/cgi/query-pr-summary.cgi?category=&severity=&priority=&class=&state=&sort=none&text=nullfs&responsible=&multitext=&originator=&release=
> As a matter of fact I just mounted a nullfs but was not able to unmount it
> (device busy) - a Google search shows it as a problem reported for many many
> years.
> Nullfs does not seem to be stable.

Dirk Engling guessed that somehow nullfs was involved.

> Anyway, I found one PR
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/147420
> that is about troubles with jails, nullfs, UFS, and NFS.
> Synopsis:       [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt inode)
> Take a look at these paragraphs:
> "...
> After two more failures, I now found the offending inode ..."
> "...
> As one point, I found the inode in a directory which usually is mounted for
> an (ez-) jail via nullfs."
> This proves that problems with jails, nullfs, and fs corruption are possible.
> So, they can not be excluded up front in your case too because nullfs is just
> a simple "path translation".

Up until yesterday (and Dirk's answer) I didn't look for specific
references to nullfs, and today I was busy getting vicious myself ;)

Thanks for pointing out a plausible cause. What I have done so far is
limit the offending jail to a specific cpuset, and I wanted to add
another disk to avoid contention with other jails. MySQL not only
consumes all the CPUs but also ties up the whole drive while it's
running some crazy full-scan query on a very large database.
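
For reference, pinning a jail to a set of CPUs can be sketched roughly
as follows with cpuset(1) and jls(8). The jail name "crmjail" and the
CPU list "0-1" are made-up placeholders, not the actual values from
this server, and the pin_jail_cmd helper is hypothetical. The script
only prints the command unless APPLY=1 is set on the FreeBSD host.

```sh
#!/bin/sh
# Sketch only: JAIL_NAME and CPU_LIST are placeholders, not real values.
JAIL_NAME="${JAIL_NAME:-crmjail}"
CPU_LIST="${CPU_LIST:-0-1}"

# Print the cpuset(1) command that limits jail ID $1 to CPU list $2.
pin_jail_cmd() {
    printf 'cpuset -l %s -j %s\n' "$2" "$1"
}

if [ "${APPLY:-0}" = "1" ]; then
    # On the FreeBSD host: look up the jail ID, apply the mask, show it.
    jid=$(jls -j "$JAIL_NAME" jid) &&
        cpuset -l "$CPU_LIST" -j "$jid" &&
        cpuset -g -j "$jid"
else
    # Dry run: only show what would be executed.
    pin_jail_cmd "<jid of $JAIL_NAME>" "$CPU_LIST"
fi
```

This only restricts CPU usage; it does nothing about disk contention,
which is why a separate disk is still on the list.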

I don't have any control over the code or the MySQL instance myself,
and the client said it's a known problem with VTiger CRM and the way it
implements some reports that I guess were not designed for the amount
of data they are handling. I have already recommended they move to a
dedicated server altogether because their system simply outgrew what
we sold them.

I really appreciate the time you dedicated to searching for a possible
explanation, and at the very least it helps in taking some immediate
steps to keep it from happening again. Hopefully, someone with deep
knowledge will find the root cause and a long-term fix. What is true
is that if it happened to me, it can happen to anyone, so maybe your
findings will help someone pinpoint the problem and fix it.



More information about the freebsd-questions mailing list