ZSTD Project Weekly Status Update

Allan Jude allanjude at freebsd.org
Tue Aug 11 03:46:03 UTC 2020


This is the eighth weekly status report on the project to complete the
integration of ZSTD compression into OpenZFS.

https://github.com/openzfs/zfs/pull/10692 - I created some new tests
around the L2ARC to facilitate testing of ZSTD + L2ARC. These tests
found an issue (even with just lz4 compression) where if the
compressed_arc feature is disabled, the wrong size is used when
calculating the checksum of the buffer read back from the L2ARC,
resulting in a silent checksum failure. After the block from the L2ARC
fails to checksum, it is re-read from the main pool.
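
As a toy illustration of this failure mode (not OpenZFS code: the buffer
layout, the zero padding, and the use of zlib and SHA-256 are all
assumptions made for the example), a checksum computed over the wrong
number of bytes will not match the recorded one even though the data is
intact:

```python
import hashlib
import zlib


def checksum(buf: bytes, size: int) -> str:
    """Checksum the first `size` bytes (SHA-256 stands in for the pool checksum)."""
    return hashlib.sha256(buf[:size]).hexdigest()


logical = b"ZFS block payload " * 512      # uncompressed contents
physical = zlib.compress(logical)          # bytes as written to disk
psize, lsize = len(physical), len(logical)

# Checksum recorded at write time, over the physical bytes.
stored = checksum(physical, psize)

# Hypothetical buffer read back from the cache device, zero-padded to lsize.
readback = physical + bytes(lsize - psize)

ok_with_psize = checksum(readback, psize) == stored
ok_with_lsize = checksum(readback, lsize) == stored
```

Only the comparison that uses the same size as the original checksum
succeeds; the other one fails without any corruption having occurred.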

https://github.com/openzfs/zfs/pull/10693 - I have created a patch to
fix the issue between L2ARC and compressed_arc.

https://github.com/allanjude/zfs/commit/1f565ef0c6bd2e785fb3777c111184bb4bc551c4
- A followup to the rewritten version of the ZSTD feature activation
code. The special handling in zfs_prop_set_special() was not actually
setting the property, so it now returns -1 so that the normal property
setting routine is followed in addition to the special handling.
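
The control flow can be sketched roughly as follows. This is a Python toy
model, not the actual C code: prop_set_special(), prop_set(), the dict
layout, and the "quota" case are hypothetical stand-ins, and only the
"return -1 to fall through to the normal routine" convention is taken
from the description above.

```python
def prop_set_special(ds: dict, name: str, value: str) -> int:
    """Toy special-case handler.

    Returns 0 when the property was fully handled here, or -1 when the
    caller should (also) run the normal property-setting routine.
    """
    if name == "quota":
        ds["quota"] = int(value)          # fully handled here
        return 0
    if name == "compression" and value.startswith("zstd"):
        ds["features"].add("zstd")        # extra work: activate the feature
        return -1                         # still set the property normally
    return -1                             # not a special property


def prop_set(ds: dict, name: str, value: str) -> None:
    if prop_set_special(ds, name, value) == -1:
        ds["props"][name] = value         # normal property-setting routine


ds = {"props": {}, "features": set(), "quota": 0}
prop_set(ds, "compression", "zstd")
prop_set(ds, "quota", "1024")
```

If the compression branch returned 0 instead, the feature would be
activated but the property itself would never be stored.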

https://github.com/allanjude/zfs/commit/8eac845a221952b3c9c52b4caf9be4bdf401e2b9
- Fixed an issue where if compression failed (this can be triggered by
"early abort", where the data is incompressible and won't fit in the
output buffer that is 12.5% smaller than the input), it would skip the
encryption code block, which could result in data being written to the
L2ARC uncompressed and unencrypted.
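
A minimal sketch of the bug pattern, in Python rather than the real C
code. zlib is swapped in for ZSTD, toy_encrypt() is a placeholder XOR and
not real encryption, and all function names are hypothetical. The only
detail taken from the description is the output buffer being 12.5%
smaller than the input:

```python
import os
import zlib


def compress_or_abort(data: bytes):
    """Return compressed bytes, or None if they do not fit in a buffer
    12.5% smaller than the input."""
    limit = len(data) - len(data) // 8
    out = zlib.compress(data)
    return out if len(out) <= limit else None


def toy_encrypt(data: bytes, key: int = 0x5A) -> bytes:
    """Placeholder transform (XOR with one byte), NOT real encryption."""
    return bytes(b ^ key for b in data)


def transform_buggy(data: bytes, encrypted: bool) -> bytes:
    out = compress_or_abort(data)
    if out is None:
        return data                    # early return skips encryption
    return toy_encrypt(out) if encrypted else out


def transform_fixed(data: bytes, encrypted: bool) -> bytes:
    out = compress_or_abort(data)
    if out is None:
        out = data                     # fall back to uncompressed, keep going
    return toy_encrypt(out) if encrypted else out


random_block = os.urandom(4096)        # effectively incompressible
```

With an incompressible block, transform_buggy() hands back the plaintext
unchanged, while transform_fixed() still applies the encryption step to
the uncompressed data.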

Based on the above, I am considering that we might want to calculate the
checksum of the block after we re-transform it, and make sure it matches
the checksum in the block pointer. If it does not, we just skip writing
to the L2ARC, as if the block was ineligible for one of the normal
reasons. This would ensure we don't end up reading from the L2ARC only
to re-read from the main pool because the block did not survive the trip.
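
The idea could look something like this. It is a Python sketch under
stated assumptions: zlib levels stand in for different compressor
behaviour, SHA-256 stands in for the pool checksum, and
should_write_to_l2arc() is a hypothetical name, not an existing function.

```python
import hashlib
import zlib


def checksum(buf: bytes) -> str:
    return hashlib.sha256(buf).hexdigest()


def should_write_to_l2arc(logical: bytes, bp_checksum: str, level: int) -> bool:
    """Re-transform the block and only cache it if the result matches
    the checksum recorded in the block pointer."""
    retransformed = zlib.compress(logical, level)
    return checksum(retransformed) == bp_checksum


data = b"some cached block contents " * 256
on_disk = zlib.compress(data, 6)       # how the block was originally written
bp_checksum = checksum(on_disk)

same = should_write_to_l2arc(data, bp_checksum, 6)       # identical transform
different = should_write_to_l2arc(data, bp_checksum, 0)  # transform changed
```

When the re-transformed bytes differ from what was originally written,
the block is simply treated as ineligible, which costs one checksum at
write time instead of a failed L2ARC read later.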

That leaves just the future-proofing bits (L2ARC, nop-write, etc., when
a newer ZSTD does not recompress the block in the same way), but that
specific bit doesn't need to block merging ZSTD support.

This project is sponsored by the FreeBSD Foundation.


On 2020-08-05 22:49, Allan Jude wrote:
> This is the seventh weekly status report on the project to integrate
> ZSTD into OpenZFS.
> 
> The compatibility related changes I created last week were refined and
> merged into the mainline branch.
> 
> Thanks to Brian Behlendorf for reviewing my proposed change for the zstd
> feature flag activation, and pointing out a better approach. I have
> reworked the patch based on his suggestion and prototype:
> 
> https://github.com/allanjude/zfs/commit/2508dafcec0a05d61afc5fbd5da356e201afbe97
> - Activate the per-dataset ZSTD feature flag as soon as the property is
> set to ZSTD. Before, simply doing `zfs set compression=zstd dataset`
> would not activate the feature flag. The feature flag would be activated
> when the first block that used ZSTD compression was written (see
> dsl_dataset_block_born()). This meant that if you set the property and
> exported the pool, the pool would import on systems with older versions
> of ZFS that did not support ZSTD, but would crash their userspace tools,
> because the property value was out of bounds.
> 
> 
> https://github.com/allanjude/zfs/commit/b8bec3fd2a8feb3a4de572eb15515d3764f92a35
> - I created a test that ensures that the feature flag is activated by
> `zfs set compression=zstd` and also ensures that the feature flag
> reverts to the 'enabled' state once the last dataset using zstd is
> destroyed.
> 
> 
> The next step is ensuring that ZSTD compression inter-operates properly
> with the L2ARC and Encryption etc.
> 
> I've also been discussing ideas with Brian about future-proofing, to
> handle the case where a newer version of ZSTD might compress the same
> input differently (better ratio), and how that would impact L2ARC,
> nop-write, etc. One idea (originally from Pawel Dawidek) is to do
> something similar to what encryption does, and split the checksum field,
> using half to checksum the original data and half the compressed
> version. This would allow ZFS to detect when the same content compressed
> differently (combined with the ZSTD version header in the compressed
> data), giving better compatibility as we upgrade ZSTD.
> 
> 
> This project is sponsored by the FreeBSD Foundation.
> 
> 
> 


-- 
Allan Jude
