svn commit: r289405 - head/sys/ufs/ffs

Bruce Evans brde at optusnet.com.au
Sat Oct 17 04:18:36 UTC 2015


On Fri, 16 Oct 2015, Warner Losh wrote:

>> On Oct 16, 2015, at 2:12 AM, Hans Petter Selasky <hp at selasky.org> wrote:
>>
>> On 10/16/15 08:21, Bruce Evans wrote:
>>> In addition, making the file contiguous in LBA space doesn't
>>>  improve the access times from flash devices because they have no seek
>>> time.
>>
>> This is not exactly true, like Bruce pointed out too. Maybe there should be a check, that if the block is too small reallocate it, else leave it for the sake of the flash. Doing 1K accesses versus 64K accesses will typically show up in the performance benchmark regardless of how fast the underlying medium is.
>
> But that’s not what this does. It isn’t the defrag code that takes the 2-8k fragments and squashes them into 16-64k block size for the device. This takes the larger blocks and makes sure they are next to each other. This takes large contiguous space (like 2MB) and puts as much as possible in a cylinder group. That’s totally useless on a flash drive.
>
> Since the sizes of the blocks are so large, moving them won’t change any benchmarks.

Not true.  First, the fragment size is actually 512-4K and the block size
is 4K-64K; these are ffs sizes, not device sizes (though the ffs block
size should be chosen to be a multiple of the device's block size).

ffs_doreallocblks() doesn't take large (2MB) blocks and put them in
a cylinder group.  It takes not very large (4K-64K) ffs blocks and tries
to make them contiguous.  Sometimes this gives large (2MB) contiguous
sets of blocks.  Cylinder groups are irrelevant except that contiguity
is impossible across them.

The pessimization that I was talking about was expanding the number of
i/o's by a large (anything more than 1%) factor by not grouping ffs
blocks.

I know too much about this from benchmarking and fixing the 10-20%
performance loss in soft updates for _reading_ of medium-size files
caused by pessimal block allocation.  The loss was larger for soft
updates than for other cases because ffs used its delayed allocation
to perfectly pessimize the location of the indirect block.  10-20%
is for files in /usr/src with an ffs block size smaller than its
current default of 32K, on not very fast hard disks.  Most of
these files don't need an indirect block when the ffs block size is
32K or even 16K.  It is surprising that the effect is so large even
when the block size is 16K: then only files larger than 192K (12
direct blocks' worth) need an indirect block.  The size of the effect
also depends on the amount of caching in disks.  With none, any seek
kills performance by much more than 10-20%.  With some, small seeks
are almost free: they just take another i/o to access the disk's
cache, provided the disk does good read-ahead.

The bug seems to be that ffs_doreallocblks() does trimming for _all_
blocks that it looks at, although most of them were just written with
a delayed write and never reached the disk.  Even the extra i/o's
for this are not good.  Hopefully the disk does nothing for trims
on unwritten blocks.

Bruce
