maildir with softupdates

Terry Lambert tlambert2 at mindspring.com
Thu Jul 24 00:34:05 PDT 2003


Bill Moran wrote:
> Attila Nagyn wrote:
> > Is this statement still valid?
> >
> > "ext3 is unsafe for maildir, and with softupdates, so is ffs."
> > http://www.irbs.net/internet/postfix/0202/0358.html
> 
> Yes,

I don't think this is true for Soft Updates, unless you take your
next statement into account, since a proper MTA will fsync data
before saying "250 OK" because its programmer will understand
POSIX semantics.  EXT3 violates POSIX semantics by default (async),
specifically "SHALL be updated" vs. "SHALL be marked for update",
while Soft Updates doesn't.
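
To make the "fsync before 250 OK" ordering concrete, here is a minimal
sketch of a maildir-style delivery step.  It is written in Python for
brevity rather than the C a real MTA would use, and the function name
deliver(), the unique_name argument and the tmp/ and new/ layout are
assumptions for illustration, not any particular MTA's code.  Whether
the rename itself is on stable storage when the reply goes out depends
on the file system; the sketch only shows that the data is fsync'ed
and the result checked before the reply is produced.

```python
import os


def deliver(maildir, unique_name, message):
    """Hypothetical delivery step: returns the SMTP reply to send.

    Any failure in write() or fsync() raises OSError, so the caller
    never reaches the "250 OK" reply for a message that is not known
    to be on stable storage.
    """
    tmp_path = os.path.join(maildir, "tmp", unique_name)
    new_path = os.path.join(maildir, "new", unique_name)

    # O_EXCL: refuse to overwrite an existing message with the same name.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        offset = 0
        while offset < len(message):
            # os.write() may write fewer bytes than requested.
            offset += os.write(fd, message[offset:])
        # Wait for the data to reach stable storage before going on.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Only a fully written, fsync'ed file is moved into new/.
    os.rename(tmp_path, new_path)
    return "250 OK"
```

If fsync fails, the exception propagates and the sender never sees
"250 OK", so it keeps the message queued and retries later.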


> It's also true that any form of write-caching is unsafe, so disable
> the caches on your SCSI and ATA hard drives.  Simply accept the
> terrible performance hit if you want super-reliability.

SCSI is generally not a problem here, because SCSI supports the
ability to disconnect writes.  Therefore it doesn't have to cheat
with its cache, the way ATA does, in order to ensure that writes
are committed to stable storage in the order requested: it merely
informs the host of the disconnected write status, and the host
software takes care to not issue an out-of-order write request to
the tagged command queue.  In ATA, the lack of ability to support
disconnected writes means that either the drives lie to the caller,
or all writes end up serialized through a single request window,
instead of one the size of the tagged command queue depth (reads
simultaneously outstanding are not a problem for ATA, since they
do not require "no bus disconnect until completion" like writes).

In general, some lazy manufacturers do not implement the proper
defaults, but they are required by the SCSI II and SCSI III (not
final) specification to provide an override for the behaviour in
mode page 2.  If they don't, you can get them kicked off the GSA
schedule and force them to lose all their large government contracts,
so they are pretty careful about adherence.


> Also, make sure you have redundant power supplies, UPSes and a diesel
> generator out back to cover power problems.

If things are built correctly, you don't need these, but these days,
hardware is not built correctly, even SCSI hardware, and it's possible
that you could lose power during a write and lose sectors other than
the one you thought you were writing, as the drive sucks everything
into the track buffer to do a read-modify-write of a track at a time.
8-(.


> In reality, anything comes with a certain amount of risk, and that
> statement is too vague to be useful.

Sure; you also forgot terrorists blowing up the data center where
your computer is housed, and a total collapse of the government in
the country where it's located, leading to anarchy and looting of
aluminum and copper from the wires that make up the power grid, in
order to appease the god Trogdor The Burninator.  8-).


> To my knowledge, ext3 is not unsafe by nature, it is simply unsafe
> by default because the default mount is async - which will generally
> be corrupted in the event of hardware failure.
> 
> UFS+softupdates generally survives hardware failure without corruption,
> although it has a funny habit of losing files that were saved right
> before the failure.  Result being that you could lose emails.

Unless your MTA fsync's and waits for the result before saying
"250 OK".  On EXT3, it would need to also "fsync" every directory
between the queue directory and the root (which is why qmail is
so slow, in general, on POSIX compliant systems, which already
guarantee ordering of commits for metadata).
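
The extra per-directory work described above could be sketched like
this, again in Python for brevity.  The helper names parent_chain(),
fsync_dir() and fsync_to_root() are hypothetical, and the sketch
assumes a POSIX system where a directory can be opened read-only and
passed to fsync.

```python
import os


def parent_chain(path):
    """Return path followed by each ancestor directory, ending at the root."""
    path = os.path.abspath(path)
    chain = [path]
    while True:
        parent = os.path.dirname(path)
        if parent == path:  # dirname("/") is "/", so we have hit the root
            break
        chain.append(parent)
        path = parent
    return chain


def fsync_dir(path):
    """Open a directory read-only and fsync it, raising OSError on failure."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_to_root(directory):
    """fsync the given directory and every directory above it."""
    for d in parent_chain(directory):
        fsync_dir(d)
```

That is one fsync per path component for every message delivered, on
top of the fsync of the message file itself, which is where the cost
comes from.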


> However ... even a sync mount can become corrupt in the event of
> hardware failure, although it's much less likely.

Yep; and don't forget those pesky Ebola victims exploding and shorting
out the entire RAID array... ;^) ;^).


> So you need to determine the risk level you're willing to accept as
> well as the performance you require.  And you probably need to do more
> research than accepting that one-line statement, as it's too vague to
> properly describe the potential risk/benefits.

It's always a question of risk.  If the business is designed
properly, what's actually happening is that you are betting your
job vs. the risk involved, and hoping you win the bet.  Some
people are happy with playing craps for their money; others need
a certain amount of security, and others want a government guarantee.

For something like bet-your-business-it-works-email-services, my
own personal risk tolerance is low enough that I would eat almost
any performance hit in order to obtain guaranteed delivery, because
in that case, a single email lost could be as bad for my business
as a fire in the copier with a missed 911 call to the fire department.

Consider how long your average 1970's business would stay in business
without their telephone, and you get the idea: it's all about keeping
alive the communications channel between you and your customer.


> Also, this is off-topic for -CURRENT, please remove -CURRENT from the
> CCs if you respond.  I'm redirecting to -QUESTIONS for future discussion.

Replied to questions, per your request, but probably -performance
would have been a better overall choice.

-- Terry


More information about the freebsd-questions mailing list