Journalling FS and Soft Updates comparison

David Rhodus sdrhodus at gmail.com
Tue Feb 15 19:35:22 PST 2005


On Wed, 09 Feb 2005 20:23:19 -0700, Scott Long <scottl at freebsd.org> wrote:
> Loren M. Lang wrote:
> 
> > Traditionally, filesystems have been designed with the idea that the
> > data will always be written to disk safely and not much effort was put
> > into making then
> >
> > Journalling Filesystems and Soft Updates are two different techniques
> > designed to solve the problem of keeping data consistent even in the
> > case of a major interruption like a power blackout.  Both work solely on
> > the meta data, not the real data.
> 
> This isn't always true.  There are journaling implementations that
> journal the data as well as the metadata.
> 
> > This means increasing a file's size
> > is protected, but not necessarily the data that's being written.  (Does
> > this also mean that the data will be written to free space before the
> > file size is increased so extraneous data won't be left in the file?)
> > Journalling works by recording in a special place on the hard drive called
> > the journal every meta data change that it is about to execute before it
> > does it, then it updates all the meta data and finally marks the journal
> > completed.  Soft updates are simply a way to order meta data updates so
> > that they happen in a safe order.  For example, moving file a from
> > directory x to directory y would first delete file a from dir x, then
> > add it to dir y.  If a crash happens in the middle, then the data
> > becomes lost, right?
> >
> 
> Part of the reordering of metadata in softupdates involves generating
> dependency graphs that prevent data loss like this.
>

In all filesystems there is a dependency graph for action ordering.  A
filesystem is a set of events and actions.  If you think of a
filesystem in terms of this graph, there are node relationships with
commutative and associative properties.

Softupdates takes specific advantage of the commutative and
associative properties of these node updates and imposes queue order
semantics on scheduling of actions.
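
As a rough illustration of dependency-ordered scheduling, here is a
minimal Python sketch applied to the rename example quoted above.  It is
not how soft updates is implemented; the function name and the update
labels are hypothetical, and the only assumption is that each update
lists the updates that must reach disk before it.

```python
from collections import deque


def order_updates(deps):
    """Return the updates in an order that respects every dependency.

    deps maps each update to the set of updates that must be written
    first.  Hypothetical structure, used here only for illustration.
    """
    pending = {u: set(before) for u, before in deps.items()}
    # Make sure updates that appear only as prerequisites are tracked too.
    for before in deps.values():
        for u in before:
            pending.setdefault(u, set())

    ready = deque(sorted(u for u, b in pending.items() if not b))
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        # Release every update that was waiting only on u.
        for v in sorted(pending):
            if u in pending[v]:
                pending[v].remove(u)
                if not pending[v]:
                    ready.append(v)

    if len(order) != len(pending):
        raise ValueError("cyclic dependency between updates")
    return order


# Moving file a from directory x to directory y: the new entry in y must
# be on disk before the old entry in x is removed, so a crash in the
# middle leaves a reachable from at least one directory.
rename = {
    "add a to y": set(),
    "remove a from x": {"add a to y"},
}
print(order_updates(rename))
```

With this ordering, a crash between the two writes leaves an extra link
to the file rather than a lost file.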
 
> > Now this shouldn't be a big deal since it's harmless to anything else,
> > just some free space is eaten up.  Since all meta data updates have this
> > same kind of harmless behavior, that's why fsck can be done in the
> > background now instead of foreground.
> 
> The theory of softupdates is that whatever metadata made it to disk
> before shutdown/crash is consistent enough to be trusted after just a
> quick preen.  The rest of the background checking is just to clean up
> blocks that became unallocated but weren't committed.
> 
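
A minimal sketch of the cleanup pass Scott describes, assuming the only
inconsistency left after a crash is blocks that are marked allocated but
that no inode references.  The function name and the data layout are
hypothetical; a real background fsck works on a filesystem snapshot and
also handles inodes and link counts.

```python
def reclaim_leaked_blocks(allocated, inodes):
    """Return (cleaned allocation set, leaked blocks).

    allocated: set of block numbers marked in use in the bitmap.
    inodes:    dict mapping an inode name to the blocks it references.
    Both structures are illustrative stand-ins, not on-disk formats.
    """
    referenced = set()
    for blocks in inodes.values():
        referenced.update(blocks)

    # A referenced block that is not marked allocated would be real
    # corruption, which this kind of cleanup pass cannot repair.
    dangling = referenced - allocated
    if dangling:
        raise ValueError(f"referenced but unallocated blocks: {sorted(dangling)}")

    leaked = allocated - referenced
    return allocated - leaked, leaked


allocated = {1, 2, 3, 4, 5}
inodes = {"a": [1, 2], "b": [4]}
cleaned, leaked = reclaim_leaked_blocks(allocated, inodes)
print(sorted(cleaned), sorted(leaked))
```

Because leaked blocks only waste space, a pass like this can run while
the filesystem is already mounted and in use.
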
> >
> > Now comparing the two, performance-wise journalling has an advantage
> > since every group of meta data updates that are written to the journal
> > at the same time can be reordered to optimize the disk performance.  The
> > disk head just has to move across the disk in order instead of seeking
> > back and forth.  Now this performance is usually lost because the
> > journal is constantly needing to be updated and it probably lies in one
> > small area of the disk.  The other benefit of the journal is very quick
> > fsck times since all it has to do is see what the journal was updating
> > and make sure it all completed.  Soft updates still require a full fsck,
> > but since it can be done in the background unlike journalling, it means
> > even faster startup time, but more cpu and i/o time spent on it.  Now if
> > the journal of a journalling fs could be kept somewhere else, say, in
> > some kind of nvram, then journalling might be overall more efficient as
> > far as disk i/o and cpu time than soft updates.
> 
> Performance between softupdates and journalling is still hotly debated,
> and your statements border on the 'flamebait' side of the argument.
> 
> >
> > I'm mainly just trying to get an understanding of these two techniques,
> > not necessarily saying one is better.  In the real world, it's probably
> > very dependent on many other things like lots of random access vs.
> > sequential, many files and file ops per seconds, vs. mostly read-only
> > with noatime set, etc.
> 
> Softupdates really aren't a whole lot different from journalling.  Both

No, journalling and soft updates are orthogonal technologies; they
do not address the same problem space, although there is some minor
overlap.  Soft updates cannot solve all the problems that
journalling can.

> turn metadata operations into a sequence of ordered atomic updates.  The
> only difference is that journalling writes these updates to the on-disk
> journal right away and then commits them later on, and softupdates keeps
> (most of) them in RAM and then commits them later on.  You are correct
> that journalling has a key advantage in that a fsck, either foreground
> or background, is not strictly required after an unexpected shutdown.
> For further information, I'd suggest reading:
> 
> http://www.mckusick.com/softdep/index.html
> 
> Scott
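
For comparison, the journalling sequence described earlier in the thread
(record the intended metadata changes, apply them, then mark the journal
entry complete) can be sketched as below.  This is a toy redo log under
stated assumptions: the journal and the metadata are plain in-memory
Python objects, all function names are hypothetical, None stands for a
removed directory entry, and checksums and write barriers are ignored.

```python
def log_transaction(journal, txid, changes):
    """Record the intended changes before touching the metadata."""
    journal.append(("begin", txid, dict(changes)))


def apply_changes(metadata, changes, limit=None):
    """Apply changes in place; stop early after `limit` writes to
    simulate a crash.  Returns True if every change was applied."""
    for i, (key, value) in enumerate(changes.items()):
        if limit is not None and i >= limit:
            return False
        metadata[key] = value
    return True


def mark_complete(journal, txid):
    """Mark a transaction as fully applied."""
    journal.append(("done", txid, None))


def replay(journal, metadata):
    """Redo every logged transaction that was never marked complete."""
    done = {txid for kind, txid, _ in journal if kind == "done"}
    replayed = []
    for kind, txid, changes in journal:
        if kind == "begin" and txid not in done:
            metadata.update(changes)  # assignments are idempotent
            replayed.append(txid)
    for txid in replayed:
        journal.append(("done", txid, None))
    return replayed


# Rename x/a to y/a as one transaction, crashing after the first write.
journal, metadata = [], {"x/a": 12}
changes = {"y/a": 12, "x/a": None}
log_transaction(journal, 1, changes)
finished = apply_changes(metadata, changes, limit=1)
if finished:
    mark_complete(journal, 1)
recovered = replay(journal, metadata)
print(recovered, metadata)
```

Recovery only has to walk the journal, not the whole filesystem, which
is why a full fsck is not strictly required afterwards.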

-- 
                                            -David
                                            Steven David Rhodus
                                            <drhodus at machdep.com>


More information about the freebsd-fs mailing list