journaling fs and large mailbox format

Mike Meyer mwm-keyword-hackers.e471b2 at mired.org
Thu Sep 29 20:13:11 PDT 2005


In <433C9A64.3030602 at spintech.ro>, Alin-Adrian Anton <aanton at spintech.ro> typed:
> I run out of inodes with Maildir, and there were just a few hundred 
> accounts. Outlook ppl tend to "leave their messages on server if they 
> are not 7 days old" and this brings Christmas every day.

How many files was that, and on how big a file system? Something seems
out of kilter.
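
One way to get those two numbers is sketched below. It is only an illustration, assuming a Unix-like system where Python's os.statvfs is available; the function names inode_usage and count_files are made up for this example and do not come from the thread.

```python
import os

def inode_usage(path):
    """Return (total, free, used_fraction) inodes for the file system holding path."""
    st = os.statvfs(path)          # Unix-only call
    total = st.f_files             # total inodes on the file system
    free = st.f_ffree              # free inodes
    # Some file systems report 0 total inodes; avoid dividing by zero.
    used = (total - free) / total if total else 0.0
    return total, free, used

def count_files(root):
    """Count regular directory entries below root, e.g. one Maildir tree."""
    return sum(len(files) for _dirpath, _dirs, files in os.walk(root))
```

Comparing count_files() on the mail spool with the total from inode_usage() would show whether the mail store really accounts for the exhausted inodes.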

> Mike Meyer wrote:
>  > The solution isn't to avoid Maildir/mh - the solution is to tune the 
> file system for the expected usage.
> 
> Well, I dislike throwing up my problems to a superior level, and act 
> like it was brilliant. It was just running away from the issue, instead 
> of dealing with it. More exactly, storage problems are database theory. 
> Storing the mail is a classic database problem. Throwing this up to the 
> filesystem level is not an elegant way of dealing with it, because now 
> the filesystem must solve it, and this imposes new restrictions to the 
> filesystem.

I hate to tell you this, but a file system *is* a database. Unix file
systems tend to be pretty simple databases, but that's not true on all
systems. Using the file system in lieu of a more complicated database
- if it will work - is a time-honored Unix technique. I keep a couple of
gig of mail archived, and let the file system deal with sorting it out
by date.
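
As a rough sketch of what "letting the file system sort it out by date" can look like with one file per message: the snippet below orders the files in a folder by modification time. It assumes the mtime reflects the message date, which holds only if nothing has touched the files since delivery; messages_by_date is a hypothetical helper name.

```python
import os

def messages_by_date(folder):
    """Return a list of (mtime, path) for regular files in folder, oldest first."""
    entries = []
    with os.scandir(folder) as it:
        for entry in it:
            if entry.is_file():
                # The file system already stores the timestamp; no index needed.
                entries.append((entry.stat().st_mtime, entry.path))
    return sorted(entries)
```

The query is answered entirely from metadata the file system maintains anyway, which is the sense in which it acts as a simple database here.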

Someone's got to solve the problems. If you can find an existing tool
to do it for you, that's brilliant, whether the tool is a file system,
a database, or a custom application. But there are tradeoffs to each
such solution, and you're the only one who can decide if a specific
solution is right for you or not.

> I agree, B-trees are for database index problems, and not only, however, 
> just imagine what would happen if mySQL or PostgreSQL would throw away 
> their database indexing/locking issue up to the filesystem level? It 
> would be a total hoax, one would need separate filesystem tuning for 
> mysql, one for postresql, one for mail, one for apache, etc.. This just 
> brings headaches and unnecessarry restrictions to the partitioning schema.

That depends on the underlying file system, and how flexible it
is. Apache, mail, etc. tend to work ok with a standard Unix file
system. Databases have more stringent requirements - including
performance constraints. I remember commercial databases recommending
that you hand them raw disk devices, and skip the OS file system
manipulations completely. File systems have gotten a lot better since
then, so they may not do that any longer.

> This is why something like dbmail seems more appropiate in my opinion 
> (conceptually).

Well, it's more appropriate for some uses. I punted on mbox format in
the 80s, when I realized that I could use stock Unix commands for
manipulating single messages if I used mh mailboxes. This was a major
win, as there weren't very good tools for manipulating single messages
in an mbox. If your usage is restricted to people doing POP/IMAP, then
dbmail would certainly work better. The downside is that you can't use
Unix tools to manipulate messages. The upside (?) is that you can use
SQL to manipulate messages, which may be a major win. I'm certainly
going to check it out.
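
To make the mbox versus mh difference concrete, here is a minimal sketch using Python's standard mailbox module, which is a modern stand-in and not the tooling available in the 80s. It copies an mbox into an mh folder, after which each message is an ordinary numbered file. The helper name split_mbox_to_mh and the paths are illustrative assumptions.

```python
import mailbox

def split_mbox_to_mh(mbox_path, mh_path):
    """Copy every message from an mbox file into an MH folder; return the MH keys."""
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.MH(mh_path, create=True)   # creates the folder if missing
    keys = []
    try:
        for msg in src:
            # Each add() writes one message to its own numbered file.
            keys.append(dst.add(msg))
        dst.flush()
    finally:
        src.close()
        dst.close()
    return keys
```

Once the messages are split out this way, removing or inspecting one of them is a plain rm, grep or cat on a single file, with no need to parse the whole mailbox.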

	Thanks,
	<mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.


More information about the freebsd-hackers mailing list