PATCH: Forcible delaying of UFS (soft)updates

Marko Zec zec at tel.fer.hr
Thu Apr 17 12:43:48 PDT 2003


On Thursday 17 April 2003 18:54, Terry Lambert wrote:
> Marko Zec wrote:
> > Ian Dowse wrote:
> > > Note that the ATA "delayed write" mechanism only delays writes while
> > > the disk is spun down; at other times there is no change in behaviour.
> > > Since the disk only spins down after it has been idle for a time,
> > > it is very unlikely that the disk is left in an inconsistent state
> > > while it is stopped.
>
> I'm wondering if the ATA "delayed write" actually does this, or if
> it merely relaxes the cache restrictions, without retaining the
> ordering enforcement.
>
> I suspect that it does not retain the ordering enforcement, as
> there is no way to disconnect on a tagged queue write, because
> you must issue a request for status, and it can't be done as a
> separate ATA operation (see the posts by the Maxtor employee, on
> and around January 20th of this year to the -FS list for details).
>
> You are much better off accumulating requests in the kernel in
> buffers, and then using the normal write mechanism to push them
> out to the drive ordered (IMO). 

That is precisely what the original OS-controlled delayed syncing patch
does :)
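
To make the "accumulate in the kernel, then push out in order" idea concrete, here is a minimal user-space sketch. It is not code from the patch: the names (write_queue, wq_enqueue, wq_flush, log_issue) and the fixed queue size are assumptions made purely for illustration, and a real implementation would work on kernel buffers rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_MAX 64          /* illustrative limit, not a real kernel value */

/* One delayed write: the target block and a pointer to its data. */
struct pending_write {
    unsigned long blkno;
    const void   *data;
};

/* FIFO ring of delayed writes; the oldest entry lives at slots[head]. */
struct write_queue {
    struct pending_write slots[PENDING_MAX];
    size_t head;
    size_t count;
};

/* Callback standing in for "hand this write to the normal write path". */
typedef void (*write_fn)(const struct pending_write *w, void *ctx);

/* Queue a write. Returns 0 on success, -1 if the queue is full, in which
 * case the caller has to flush before it can delay anything further. */
int wq_enqueue(struct write_queue *q, unsigned long blkno, const void *data)
{
    assert(q != NULL);
    if (q->count == PENDING_MAX)
        return -1;
    size_t tail = (q->head + q->count) % PENDING_MAX;
    q->slots[tail].blkno = blkno;
    q->slots[tail].data = data;
    q->count++;
    return 0;
}

/* Issue every queued write in exactly the order it was queued, so that
 * ordering dependencies between them are preserved. Returns the number
 * of writes issued. */
size_t wq_flush(struct write_queue *q, write_fn issue, void *ctx)
{
    assert(q != NULL && issue != NULL);
    size_t n = 0;
    while (q->count > 0) {
        issue(&q->slots[q->head], ctx);
        q->head = (q->head + 1) % PENDING_MAX;
        q->count--;
        n++;
    }
    return n;
}

/* Example callback: records the order in which blocks were issued. */
struct issue_log {
    unsigned long blkno[PENDING_MAX];
    size_t n;
};

void log_issue(const struct pending_write *w, void *ctx)
{
    struct issue_log *log = ctx;
    if (log->n < PENDING_MAX)
        log->blkno[log->n++] = w->blkno;
}
```

The point of the sketch is only that the flush is deliberately not sorted by block number: the writes leave in the order they were queued, which is what the drive-side delayed write may not guarantee.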

> This implies a barrier and new
> code above the bwrite interface, to keep the buffers from getting
> locked, and stalling your applications in user space.
>
> A problem I see here is that swap is on a totally different path,
> and in a different area of the disk (practically guaranteeing a
> seek, and a track buffer invalidation on the disk), even if you
> could cause swapping to be delayed (I don't think you can; FreeBSD
> aggressively uses memory, and so when you need to swap, you *need*
> to swap).
>
> > The OS _does_ know (approximately) when the disk is spinning and when
> > not. For example, if the disk is configured to stop spinning immediately
> > after the last I/O operation, the OS can safely assume 10 or more seconds
> > > afterwards the spinning will be stopped. The OS only has to keep a record
> > > (a timestamp or something similar) of when it issued the last
> > I/O request to the disk. In my patch this is accomplished using the
> > stratcalls marker, which is increased every time the strategy routine of
> > the ATA disk driver is invoked. Therefore the OS can also successfully
> > coalesce the pending disk updates with other outstanding I/O disk
> > operations, which are typically reads of uncached sectors or VM swapping.
>
> This is useful, but not enough.  You need to actually communicate
> the information above the block I/O layer, to the soft updates.  I
> think, effectively, what you actually want to do is to stop the
> soft updates clock

Hey man, that's exactly what I have done in my patch ("stopping the soft 
updates clock" as you call it). On the block I/O layer I'm only checking if 
the disk is active or not... Are you sure you have checked out the patch / 
code?
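
For readers who have not looked at the code, the decision logic can be sketched roughly as follows. This is an illustration only and not the patch itself: apart from the stratcalls counter mentioned above, every name here (disk_activity, spindown_ticks, max_delay_ticks, syncer_should_run) is hypothetical, and the tick-based thresholds are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping shared by the disk driver and the syncer. */
struct disk_activity {
    unsigned long stratcalls;      /* bumped on every strategy-routine call */
    unsigned long last_seen;       /* stratcalls value at the last activity seen */
    unsigned int  idle_ticks;      /* consecutive ticks without new I/O (capped) */
    unsigned int  spindown_ticks;  /* idle ticks after which spin-down is assumed */
    unsigned int  max_delay_ticks; /* longest time updates may be held back */
    unsigned int  delayed_ticks;   /* ticks for which updates have been held */
};

/* Block I/O layer: only note that the disk has been touched. */
void disk_strategy_called(struct disk_activity *s)
{
    assert(s != NULL);
    s->stratcalls++;
}

/* Called once per syncer tick. Returns true if the soft updates clock
 * should advance (pending updates may be written), false if it should
 * stay stopped so that the disk is not woken up. */
bool syncer_should_run(struct disk_activity *s)
{
    assert(s != NULL);

    if (s->stratcalls != s->last_seen) {
        /* Some other I/O hit the disk since the last tick, so it is
         * spinning anyway: coalesce our updates with that activity. */
        s->last_seen = s->stratcalls;
        s->idle_ticks = 0;
        s->delayed_ticks = 0;
        return true;
    }

    if (s->idle_ticks < s->spindown_ticks)
        s->idle_ticks++;
    if (s->idle_ticks < s->spindown_ticks) {
        /* Idle, but presumably still spinning: behave as usual. */
        return true;
    }

    /* Presumed spun down: hold the updates, but never for longer than
     * max_delay_ticks in a row. */
    if (++s->delayed_ticks >= s->max_delay_ticks) {
        s->delayed_ticks = 0;
        return true;
    }
    return false;
}
```

With spindown_ticks set to 2 and max_delay_ticks set to 3, an untouched disk gets one normal tick, then two held ticks, then a forced flush; any call to disk_strategy_called() in between makes the next tick run immediately.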

> , rather than trying to play stupid disk tricks
> with timers, etc., above and beyond what you have to do.  I can see
> it being useful on SCSI disks, as well, particularly where there are
> temperature issues.  Though in that case, you probably are more
> memory starved than anything, and it will end up doing you no good.
>
> > I agree that ATA delayed writes are a great feature that can help
> > save battery power.
>
> I don't; only if the write order is maintained is it "great".
>
> > I just want to point out that it can suffer from the same
> > consistency problems as the model of OS-controlled delayed syncing
> > combined with null fsync() processing. However, if the OS controls the
> > delaying of updates, you can turn on or off normal fsync() semantics as
> > desired. With delaying writes in ATA firmware, you simply do not have the
> > choice :)
>
> I think people are confusing fsync() with syncd at this point.  8-(.
>
> -- Terry


