Strange IO performance with UFS

Don Lewis truckman at FreeBSD.org
Tue Jul 8 22:30:13 UTC 2014


On 5 Jul, Konstantin Belousov wrote:
> On Sat, Jul 05, 2014 at 06:18:07PM +0200, Roger Pau Monné wrote:

>> As can be seen from the log above, at first the workload runs fine,
>> and the disk is only performing writes, but at some point (in this
>> case around 40% of completion) it starts performing this
>> read-before-write dance that completely screws up performance.
> 
> I reproduced this locally.  I think my patch is useless for the fio/4k write
> situation.
> 
> What happens is indeed related to the amount of available memory.
> When the size of the file written by fio is larger than memory, the
> system has to recycle the cached pages.  So after some point, a write
> has to do a read-before-write, and not just at EOF (since fio
> pre-allocated the job file).

I reproduced this locally with dd if=/dev/zero bs=4k conv=notrunc ...
Even in the small file case, if I flush the file from cache by
unmounting and then remounting the filesystem where it resides, I see
lots of reads right from the start.

> In fact, I used a 10G file on an 8G machine, but I interrupted the
> fio before it finished the job.  The longer the previous job runs,
> the longer the new job goes without issuing reads.  If I allow the
> job to completely fill the cache, then the reads start immediately on
> the next job run.
> 
> I do not see how anything could be changed here, if we want to keep
> the user's file contents on partial block writes, and we do.

About the only thing I can think of that might help is to trigger
readahead when we detect sequential small writes.  We'll still have to
do the reads, but hopefully they will be larger and occupy less time in
the critical path.

Writing a multiple of the filesystem blocksize is still the most
efficient strategy.
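A minimal sketch of why, assuming a 32 KB block size for illustration
(the real value comes from the filesystem): a write can skip the
read-before-write only when it is block-aligned and covers whole
blocks, so any other block left partially untouched must be read first
to preserve its existing contents.

```python
BLOCK = 32 * 1024  # example UFS block size; query the real filesystem

def needs_read(offset, length):
    # A write that leaves part of a block untouched must first read
    # that block to preserve the existing file contents.
    return offset % BLOCK != 0 or length % BLOCK != 0

print(needs_read(0, 4096))      # True: 4 KB write into a 32 KB block
print(needs_read(0, BLOCK))     # False: aligned full-block write
print(needs_read(4096, BLOCK))  # True: full-size but misaligned
```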




More information about the freebsd-fs mailing list