UFS2 snapshots on large filesystems
anderson at centtech.com
Tue Nov 15 05:07:05 PST 2005
Oliver Fromme wrote:
> Eric Anderson <anderson at centtech.com> wrote:
> > Oliver Fromme wrote:
> > > [...]
> > > No. Your /var has just 663 inodes in use, and it has about
> > > 1.7 million unused inodes which is just a waste.
> > Oops! Thanks for the correction - I misread it in my pasting frenzy. :)
> > It may be a waste, but perhaps the right answer would be in the form
> > of a patch to make sysinstall create /var partitions with different
> > settings, if you feel strongly about it.
> Well, I don't feel very strongly about sysinstall, but I do
> think that too few people read the tuning(7) manpage. :-)
> I think sysinstall's default values (which just uses newfs'
> defaults) are a good trade-off. If you run out of inodes,
> then you are in serious trouble -- you probably have to re-
> create the whole file system (dump, newfs, restore or simi-
> lar). But if you have way too many inodes, then you waste
> some space and fsck time, but that's not a critical problem
> to most users, because at least it keeps running.
> That's probably the reason why the default values provide
> a rather high inode density. And after all, you _can_
> change it if you know what you're doing (after reading
> tuning(7) and other documentation). Even sysinstall
> provides a way to enter newfs flags, so you can easily change
> the inode density from the beginning.
> It's also interesting to note that, historically, the /var
> partition is used to hold spool areas, such as the spool
> of news servers. INN's traditional spool layout (which is
> still popular for small servers because it allows better
> control) stores each article in a separate file, so you
> need a significant number of inodes in /var in that case.
> (Of course, for "big" news servers, you usually choose a
> different spool layout such as cycle buffers, and you
> don't put them on the /var partition but on their own
> optimized file system.)
> It all comes down to the fact that neither sysinstall nor
> newfs know in advance what purpose a file system will be
> used for, so they have no idea what default inode density
> would be suitable. So they choose rather conservative
> defaults for the "worst case", i.e. many inodes. It's up
> to the user to change the defaults if appropriate.
> Of course it's not an error to have way too many inodes.
> But I think it's a suboptimal setting, and it is always
> worth thinking about the usage of the file system in
> advance, before running newfs. Each inode takes 256 bytes
> in UFS2 (in UFS1 it's 128 bytes). On a 250 Gbyte disk
> (typical size nowadays), the default parameters will
> reserve space for 30 million inodes. That's 7,5 Gbyte
> reserved to inodes which will not be available to actual
> file data (and which adds to fsck time significantly).
Yes, I agree with you on all the above, but honestly, I guess I think
that 3% of a filesystem being used for inodes out-of-the-box on a 250GB
partition, isn't such a big deal, considering 8% is set aside for root
only use to keep the filesystem from getting too cluttered and
performing poorly. You have to weigh the savings from the drop in
inodes, vs the loss for fragmentation or non-optimal block usage. If
you are so close in space usage that you need that few percent
difference used from inodes, then I guess my thoughts are that better
space forecasting should have been in place. I'm all for efficient usage
of resources, but there's a point where the risk of reducing the inodes,
when you are not certain about the usage pattern of the disk over time,
is too high, whereas letting a couple of GB disappear is insignificant.
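For anyone who wants to check the numbers being thrown around here, a
rough sketch of the arithmetic follows. It assumes the figures quoted in
this thread (one inode per 8 kbyte of data, 256 bytes per UFS2 inode, an
8% root-only reserve) and reads "250 Gbyte" as 250 * 10**9 bytes. The
function name inode_overhead is made up for illustration, and a real
newfs run will differ somewhat because of cylinder group layout.

```python
# Back-of-the-envelope check of the inode overhead discussed above.
# Assumptions (from the thread, not measured): one inode per 8 kbyte,
# 256 bytes per UFS2 inode, 8% reserve, 250 Gbyte = 250 * 10**9 bytes.

def inode_overhead(fs_bytes, bytes_per_inode=8192, inode_size=256):
    """Return (inode_count, bytes_used_by_inodes, fraction_of_fs)."""
    inodes = fs_bytes // bytes_per_inode
    inode_bytes = inodes * inode_size
    return inodes, inode_bytes, inode_bytes / fs_bytes

fs_bytes = 250 * 10**9
inodes, inode_bytes, fraction = inode_overhead(fs_bytes)
reserve_bytes = fs_bytes * 8 // 100  # root-only reserve at 8%

print(f"inodes:       {inodes / 1e6:.1f} million")
print(f"inode space:  {inode_bytes / 1e9:.1f} GB ({fraction:.1%})")
print(f"root reserve: {reserve_bytes / 1e9:.1f} GB (8%)")
```

Under those assumptions the inode area comes out at a little over 3% of
the partition, well below the reserve. Passing a larger bytes_per_inode
shows how much a lower density would give back.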
> > Right, this is typical for the types of data I store, which often
> > average 8-16k per file, which I think is the default expectation for
> > UFS2 filesystems, so I'm making a generalization that a majority of
> > users also have a ~16k average filesize.
> I don't think that's true. The default values rather pre-
> sume the _minimum_ (not average) file size that most users
> will need, so that only very few users will hit the inode
> limit. If the newfs default was the expected average file
> size, then 50% of users would hit the limit (and then flood
> the mailing lists).
Well, I was stating our company's storage pattern here, and stating that
the default for UFS2 appears to agree with our patterns, and simply
making a generalization that many people might be in a similar situation
as our company. I don't think it's true that choosing the average
would yield 50% of the users having problems, but I see your point. In
fact, I think McKusick makes mention of a study of the average file size
being just under 16K (in 'Design and Implementation of the FreeBSD
Operating System'), so that is why they made the choice of a 16k block
size the default for UFS2.
> As I explained above, the default (which is one inode per
> 8 kbyte of data if you use the standard bsize/fsize) is
> chosen to be a conservative value, so that only very few
> people will need to lower it.
> > True - agreed, however I'm assuming most users of FreeBSD's UFS2
> > filesystem are in the 16k average filesize range.
> I don't think so. Nowadays, multimedia data makes a signi-
> ficant share of all data stored, and such files tend to be
> rather large. That's why they got their own file system in
> my server, so I can tune the newfs parameters for it, so I
> don't waste several Gbytes of space and don't have to wait
> half an hour for fsck.
I'm not sure where you came up with the 'multimedia data makes a
significant share of all data stored' statistic, but I just don't know
of a lot of companies that store multimedia files in such large
quantities to justify these claims. It's very possible though that I am
closed to the industry in which I am in, and so I don't see the 'other
> > If the average
> > users' average file size is larger, then the default newfs parameters
> > should be changed,
> As explained above, the newfs default parameters should be
> rather low, so they work for the "worst case". E.g. the
> source tree of FreeBSD RELENG_6 has indeed an average file
> size of 16082 bytes (I just looked a minute ago). But this
> is certainly not the typical use that takes up most of
> user's disk space. On my root file system (standard Free-
> BSD installation), the average file size is 42 Kbyte, on
> /var it's 37 kbyte, and on /usr it's 60 kbyte, even though
> it contains /usr/src and the ports collection (which is
> thousands of very small files).
> > > Of course, if you design a file system for different
> > > purposes, your requirements might be completely different.
> > > A maildir server or squid proxy server definitely requires
> > > a much higher inode density, for example.
> > If a filesystem were to be designed from scratch, having the inode
> > density variable or automatically grow to fulfill the needs, would be
> > the most efficient probably.
> Yes, I agree completely.
It would be interesting to do some sampling of a number of companies,
and see what their mean/median/mode filesize is on production data.
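As a starting point for that kind of sampling, here is a minimal sketch
that walks a directory tree and reports the mean and median file size,
plus the most common size rounded down to a 1 kbyte bucket as a stand-in
for the mode. The function names and the bucket size are my own choices
for illustration. It counts regular files only and looks at apparent
file size, not blocks actually allocated on disk.

```python
import os
import stat
import statistics
from collections import Counter

def file_sizes(root):
    """Yield the size in bytes of every regular file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                yield st.st_size

def summarize(sizes, bucket=1024):
    """Return file count, mean, median and most common size bucket."""
    sizes = list(sizes)
    if not sizes:
        return None
    # Round each size down to a multiple of `bucket` before counting,
    # since exact byte sizes rarely repeat often enough to be useful.
    buckets = Counter(s // bucket * bucket for s in sizes)
    return {
        "files": len(sizes),
        "mean": statistics.mean(sizes),
        "median": statistics.median(sizes),
        "mode_bucket": buckets.most_common(1)[0][0],
    }

# Example: print(summarize(file_sizes("/usr/src")))
```

Comparing the mean against the median gives a quick hint of how much a
few very large files are pulling the average up.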
Eric Anderson Sr. Systems Administrator Centaur Technology
Anything that works is better than anything that doesn't.