UFS2 tuning for heterogeneous 4TB file system

Fri Jul 31 13:29:40 UTC 2009

On 7/26/09, Maxim Khitrov <mkhitrov at gmail.com> wrote:
> On Sun, Jul 26, 2009 at 3:56 AM, b. f.<bf1783 at googlemail.com> wrote:
>>>The file system in question will not have a common file size (which is
>>>what, as I understand, bytes per inode should be tuned for). There
>>>will be many small files (< 10 KB) and many large ones (> 500 MB). A
>>>similar, in terms of content, 2TB ntfs file system on another server
>>>has an average file size of about 26 MB with 59,246 files.
>>
>> Ordinarily, it may have a large variation in file sizes, but can you
>> intervene, and segregate large and small files in separate
>> filesystems, so that you can optimize the settings for each
>> independently?
>
> That's a good idea, but the problem is that this raid array will grow
> in the future as I add additional drives. As far as I know, a
> partition can be expanded using growfs, but it cannot be moved to a
> higher address (with any "standard" tools). So if I create two
> separate partitions for different file types, the first partition will
> have to remain a fixed size. That would be problematic, since I cannot
> easily predict how much space it would need initially and for all
> future purposes (enough to store all the files, yet not waste space
> that could otherwise be used for the second partition).
>

Perhaps gconcat(8), gmirror(8), or vinum(4) will solve your problem
here.  I think there are other tools as well.
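
Whichever layout is chosen, rough arithmetic on inode density (the
bytes-per-inode value passed to newfs with -i) may help when sizing
the file systems. The sketch below is only an approximation: it
assumes a 4 TiB file system and 256-byte UFS2 inodes, the three
densities are hypothetical examples rather than recommendations, and
newfs allocates inodes per cylinder group and may adjust the
requested value, so real numbers will differ somewhat.

```python
# Rough inode arithmetic for a UFS2 file system (illustrative only).
UFS2_INODE_BYTES = 256  # on-disk size of one UFS2 inode


def inode_estimate(fs_bytes, bytes_per_inode):
    """Return (inode_count, inode_table_bytes) for a given density."""
    inodes = fs_bytes // bytes_per_inode
    return inodes, inodes * UFS2_INODE_BYTES


FS_BYTES = 4 * 2**40  # the 4 TB array from the thread, taken as 4 TiB

# Hypothetical -i values: 8 KiB, 64 KiB and 1 MiB per inode.
for density in (8 * 2**10, 64 * 2**10, 2**20):
    inodes, table = inode_estimate(FS_BYTES, density)
    print(f"-i {density:>8}: {inodes:>12,} inodes, "
          f"{table / 2**30:8.2f} GiB of inode space")
```

Under these assumptions, even one inode per MiB leaves roughly four
million inodes, far more than the 59,246 files on the comparable
2TB ntfs file system mentioned above.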

>>>Ideally, I would prefer that small files do not waste more than 4 KB
>>>of space, which is what you have with ntfs. At the same time, having
>>>fsck running for days after an unclean shutdown is also not a good
>>>option (I always disable background checking). From what I've gathered
>>>so far, the two requirements are at the opposite ends in terms of file
>>>system optimization.
>>
>> I gather you are trying to be conservative, but have you considered
>> using gjournal(8)?  At least for the filesystems with many small
>> files?  In that way, you could safely avoid the need for most if not
>> all use of fsck(8), and, as an adjunct benefit, you would be able to
>> operate on the small files more quickly:
>>
>> http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064043.html
>> http://www.freebsd.org/doc/en_US.ISO8859-1/articles/gjournal-desktop/article.html
>>
>> gjournal has a lower overhead than ZFS, and has proven to be fairly
>> reliable.  Also, you can always unhook it and revert to plain UFS
>> mounts easily.
>>
>> b.
>>
>
> Just fairly reliable? :)
>

Well, I'm not going to promise the sun, the moon, and the stars.  It
has worked for me (better than softupdates, I might add) under my more
modest workloads.

> I've done a bit of reading on gjournal and the main thing that's
> preventing me from using it is the recency of implementation. I've had
> a number of FreeBSD servers go down in the past due to power outages
> and SoftUpdates with foreground fsck have never failed me. I have
> never had a corrupt ufs2 partition, which is not something I can say
> about a few linux servers with ext3.
>
> Have there been any serious studies into how gjournal and SU deal with
> power outages? By that I mean taking two identical machines, issuing
> write operations, yanking the power cords, and then watching both
> systems recover? I'm sure that gjournal will take less time to reboot,
> but if this experiment is repeated a few hundred times I wonder what
> the corruption statistics would be. Is there ever a case, for
> instance, when the journal itself becomes corrupt because the power
> was pulled in the middle of a metadata flush?
>

I'm not aware of any such tests, but I wouldn't be surprised if pjd@
or someone else who was interested in using gjournal(8) in a demanding
environment had made some.  I'll cc freebsd-fs@, because some of them
may not monitor freebsd-questions.  Perhaps someone there has some
advice.  You might also try asking on freebsd-geom@.

Regards,
                     b.

> Basically, I have no experience with gjournal, poor experience with
> other journaled file systems, and no real comparison between
> reliability characteristics of gjournal and SoftUpdates, which have
> served me very well in the past.
>
> - Max
>