on st_blksize value

Wed Mar 24 17:02:44 UTC 2010

On Wed, 24 Mar 2010, Andrew Snow wrote:

> Not strictly true: in ZFS the recordsize setting is for the maximum size of a 
> record, it can still write smaller than this.  If you overwrite 1K in the 
> middle of a 128K record then it should just be writing a 1K block.  Each 
> block has its own checksum attached to it so there's no need to recalculate 
> checksums for data that isn't changing.

This is not true, and simple testing will clearly show that it is 
not.

ZFS always writes full recordsize blocks, except that the tail block 
of a file is allowed to be smaller.  If compression is enabled, each 
block is stored at its compressed size, so the amount actually stored 
on disk may be less than the established recordsize.

Because ZFS uses a read-modify-write strategy, it is important for 
performance that the record to be modified already be cached in the 
ARC; otherwise it must first be read from disk.  Even when it is 
cached, there will still be write amplification if the update size is 
smaller than the recordsize.
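
To put a rough number on that amplification, here is a small Python 
sketch.  It is only arithmetic under stated assumptions, not a 
measurement of ZFS: it assumes every record touched by an update is 
rewritten in full, and it ignores compression, a shorter tail block, 
and metadata writes.  The function names are made up for 
illustration.

```python
# Back-of-the-envelope estimate of write amplification for a small
# overwrite, ASSUMING each touched record is rewritten in full.
# Ignores compression, a shorter tail block, and metadata updates.

def records_touched(offset, length, recordsize):
    """Number of recordsize-aligned records overlapped by the update."""
    if length <= 0 or recordsize <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; length and recordsize > 0")
    first = offset // recordsize
    last = (offset + length - 1) // recordsize
    return last - first + 1

def write_amplification(offset, length, recordsize=128 * 1024):
    """Bytes rewritten divided by bytes the application changed."""
    rewritten = records_touched(offset, length, recordsize) * recordsize
    return rewritten / length

if __name__ == "__main__":
    K = 1024
    # 1K overwrite in the middle of a 128K record: one record rewritten.
    print(write_amplification(64 * K, 1 * K))        # 128.0
    # Same 1K overwrite straddling a record boundary: two records.
    print(write_amplification(128 * K - 512, 1 * K)) # 256.0
    # Aligned full-record overwrite: no amplification.
    print(write_amplification(0, 128 * K))           # 1.0
```

Under these assumptions, the 1K-in-128K case from the quoted message 
rewrites 128 times more data than the application changed.  A smaller 
recordsize shrinks that ratio for small updates.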

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/