ZFS: How to enable cache and logs.
Bob Friesenhahn
bfriesen at simple.dallas.tx.us
Fri May 13 14:13:52 UTC 2011
On Thu, 12 May 2011, Freddie Cash wrote:
>>
>> Zfs would certainly appreciate 128K since that is its default block size.
>
> Note: the "default block size" is a max block size, not an "every
> block written is this size" setting. A ZFS filesystem will use any
> power-of-2 size under the block size setting for that filesystem.
Except for file tail blocks, or when compression/encryption is used,
zfs will write full blocks as configured for the filesystem being
written to (the setting in effect when the file was originally created).
Even with compression/encryption enabled, the input (uncompressed)
data size is the configured block size. The block needs to be read,
and (possibly) decompressed, and (possibly) decrypted so that it can
be checksummed, and any changes made. The checksum is based on the
decoded block in order to capture as many potential error cases as
possible, and so that the zfs "send" stream can use the same
checksums.
Zfs writes data in large transaction groups ("TXG"), which allows it to
buffer quite a lot of update data (up to 5 seconds' worth) before
anything is actually written. Even if the application writes only
16KB at a time, zfs is likely to have buffered many times 128KB by the
time the next TXG is written.
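
As a rough illustration of that buffering, here is a small Python sketch that counts how many complete 128KB records' worth of data pile up in one TXG interval. The 16KB write size, the 128KB block size and the 5 second interval come from the discussion above; the write rate and the function name are made up for the example and are not anything zfs exposes.

```python
RECORD_SIZE = 128 * 1024   # filesystem block size from the discussion
WRITE_SIZE = 16 * 1024     # application write size from the discussion
TXG_SECONDS = 5            # upper bound on buffering mentioned above


def records_buffered(writes_per_second, seconds=TXG_SECONDS):
    """Return how many complete records' worth of data is buffered.

    Hypothetical helper: it only does the arithmetic and does not
    model how zfs actually schedules or closes a TXG.
    """
    total_bytes = writes_per_second * WRITE_SIZE * seconds
    return total_bytes // RECORD_SIZE


# Assumed rate: 1000 sequential 16KB writes per second.
print(records_buffered(1000))  # prints 625
```

Even at a much lower write rate the buffered data quickly exceeds a single record, which is why small sequential application writes can still end up as full blocks on disk.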
If zfs goes to write a block and the user has supplied less than the
block size, it may first need to read the existing block content
(which includes checksum validation, and possibly decompression and
decryption) so that it can fill in the gaps, since it always writes
full blocks. This happens when the file data is no longer cached,
either because it has not been accessed for a long time or because the
system is under memory pressure. The blocks are written using a Copy
On Write ("COW") algorithm, so the updated block is written to a new
block location. If the NFS client conveniently sent the data 128K at a
time for sequential writes, then there is a better chance that zfs
will be able to avoid some heavy lifting.
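
To make the read-before-write case concrete, here is a minimal Python sketch that, for a given write, lists which records are only partly covered and so would have to be read first if they are not cached. It assumes the write lands inside existing file data and that nothing is in cache; the function name is hypothetical and this is not how zfs itself is implemented.

```python
RECORD_SIZE = 128 * 1024   # filesystem block size from the discussion


def records_needing_read(offset, length, record_size=RECORD_SIZE):
    """Return indices of records that a write only partly covers.

    Assumption: the affected records already exist and are not cached,
    so each partly covered record must be read to fill in the gaps.
    """
    if length <= 0:
        return []
    end = offset + length
    first = offset // record_size
    last = (end - 1) // record_size
    partial = []
    for idx in range(first, last + 1):
        rec_start = idx * record_size
        rec_end = rec_start + record_size
        # A record is fully overwritten only if the write spans all of it.
        if offset > rec_start or end < rec_end:
            partial.append(idx)
    return partial


K = 1024
print(records_needing_read(0, 16 * K))        # [0]    small write, one read
print(records_needing_read(0, 128 * K))       # []     aligned full record
print(records_needing_read(64 * K, 128 * K))  # [0, 1] unaligned, two reads
```

An aligned 128K write covers its record completely, so no existing data has to be fetched, while the same amount of data written at a 64K offset touches two records and needs both of them read.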
Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
More information about the freebsd-fs
mailing list