PATCH: Forcible delaying of UFS (soft)updates

Marko Zec zec at tel.fer.hr
Thu Apr 17 05:03:57 PDT 2003


Ian Dowse wrote:

> In message <3E9C517B.6039679A at tel.fer.hr>, Marko Zec writes:
> >Tempted by a lot of opposition to the concept of (optionally) ignoring
> >fsync() calls when running on battery power, I wonder what effect the
> >concept of unconditional delaying of _all_ disk updates by ATA-disk
> >firmware will make on FS consistency in case of system crash or power
> >failure? I do not want to imply such a concept is a priori bad, however
> >I fail to realize its advantages over OS-controlled delaying of disk
> >synching.
>
> Note that the ATA "delayed write" mechanism only delays writes while
> the disk is spun down; at other times there is no change in behaviour.
> Since the disk only spins down after it has been idle for a time,
> it is very unlikely that the disk is left in an inconsistent state
> while it is stopped.
>
> Just after the disk spins up there is a small window where the
> cached writes get written out in a burst. Due to the amount of
> cached data and the probable re-ordering of writes, the disk is
> quite likely to be in an inconsistent state during this flurry of
> writes, but the window is short so it is probably not a big issue
> in practice.
>
> The main advantage of using the ATA delayed write mechanism is that
> the disk itself can take advantage of knowing whether or not it is
> spinning, whereas the OS does not have that information.

The OS _does_ know (approximately) when the disk is spinning and when it
is not. For example, if the disk is configured to stop spinning immediately
after the last I/O operation, the OS can safely assume that 10 or more
seconds later the disk will have stopped. The OS only has to keep a record
(a timestamp or something similar) of when it issued the last I/O request
to the disk. In my patch this is accomplished using the stratcalls marker,
which is incremented every time the strategy routine of the ATA disk driver
is invoked. The OS can therefore also coalesce the pending disk updates with
other outstanding disk I/O operations, which are typically reads of
uncached sectors or VM swapping.
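
To make the idea concrete, here is a minimal sketch of how such a marker
could be used. This is not the code from the patch: only the name stratcalls
comes from it, while the struct, the function names, the once-per-second
polling and the spindown_secs / max_delay_secs parameters are assumptions
made purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; only "stratcalls" is a name from the patch. */
struct disk_activity {
	unsigned long stratcalls;  /* incremented on every strategy call */
	unsigned long last_seen;   /* stratcalls value at the previous poll */
	unsigned int  idle_secs;   /* consecutive polls with no new calls */
};

/* Would be called from the disk driver's strategy routine. */
static void
note_strategy_call(struct disk_activity *da)
{
	da->stratcalls++;
}

/* Assumed to run once per second, e.g. from a syncer-like thread. */
static void
poll_activity(struct disk_activity *da)
{
	if (da->stratcalls != da->last_seen) {
		da->last_seen = da->stratcalls;
		da->idle_secs = 0;          /* I/O was issued since last poll */
	} else {
		da->idle_secs++;            /* another idle second */
	}
}

/* An estimate only: the OS infers the spin state from its own I/O history. */
static bool
disk_probably_spinning(const struct disk_activity *da,
    unsigned int spindown_secs)
{
	return da->idle_secs < spindown_secs;
}

/*
 * Flush delayed updates either when the disk is probably spinning anyway
 * (coalescing with other I/O) or when they have been held for too long.
 */
static bool
should_flush(const struct disk_activity *da, unsigned int spindown_secs,
    unsigned int delayed_secs, unsigned int max_delay_secs)
{
	return disk_probably_spinning(da, spindown_secs) ||
	    delayed_secs >= max_delay_secs;
}
```

With spindown_secs set to 10, a read of an uncached sector resets idle_secs,
so updates that were being held back can be written out while the disk is
up anyway instead of forcing a separate spin-up later.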

> The downside
> is that it is not guaranteed that fsync'd data gets written to disk
> immediately, though in practice the disk tends to be spinning when
> the fsync is performed due to the previous accesses. I've been using
> ATA delayed writes on a few laptops for over a year and it has never
> caused me any problems - it generally works just right in the sense
> that the disk remains spun down when the machine is mostly idle,
> and spins up when you save files from an editor etc.

I agree that ATA delayed writes are a great feature that can help save
battery power. I just want to point out that they can suffer from the same
consistency problems as the model of OS-controlled delayed syncing combined
with null fsync() processing. However, if the OS controls the delaying of
updates, you can turn normal fsync() semantics on or off as desired. With
writes delayed in ATA firmware, you simply do not have that choice :)
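
As a rough illustration of that choice, the decision could look like the
sketch below. The struct, the enum and the function name are hypothetical
and are not taken from the patch; they only show that an OS-side policy can
keep normal fsync() semantics even while other updates are being delayed.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum fsync_action { FSYNC_FLUSH_NOW, FSYNC_DEFER };

struct delay_policy {
	bool delay_updates;  /* OS-level delaying enabled, e.g. on battery */
	bool null_fsync;     /* treat fsync() as a no-op while delaying */
};

/* Decide what an fsync() call should do under the current policy. */
static enum fsync_action
fsync_policy(const struct delay_policy *p)
{
	if (p->delay_updates && p->null_fsync)
		return FSYNC_DEFER;      /* fsync() is deliberately ignored */
	return FSYNC_FLUSH_NOW;          /* normal fsync() semantics */
}
```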

Cheers,

Marko

> Doing the write delaying in the OS is always going to be a tradeoff
> between excessively delaying writes when the machine is busy and
> maximising the time between spin-ups when idle, though obviously
> there is more control possible over which writes get delayed and
> which don't.
>
> Ian



