Re: ZFS pool balance and performance

From: Frank Leonhardt <freebsd-doc_at_fjl.co.uk>
Date: Mon, 25 Aug 2025 09:59:13 UTC
On 25/08/2025 07:18, Gerrit Kühn wrote:
> On Sun, 24 Aug 2025 10:41:40 -0400,
> Chris Ross <cross+freebsd@distal.com> wrote:
>
>>> I'm unaware of any good means to rebalance allocation
>>> "in-place."
>> Yeah, google search suggested there isn’t any way to rebalance.
> I think I faced a badly balanced pool after adding more disks many years
> ago, and just wrote the files to themselves using rsync or similar to make
> zfs rebalance them. After a bit of googling, I just found this script
> which appears to do something similar (although it creates a real copy
> first and does not write files to themselves directly):
> https://github.com/markusressel/zfs-inplace-rebalancing/blob/master/README.md
Looking at that script, it's basically doing a "cp -a" on each file and
then paranoidly checking it's copied correctly before removing the
original (so much for trusting ZFS :-) )

A few thoughts:


1) If you'd added a vdev and wanted to spread stuff to it without 
waiting for it to migrate naturally, this is as good a way as any, 
although I'd also consider copying ZFS datasets rather than individual 
files.

2) A quick 'C' program that flipped a byte in every file block and then
flipped it back would be more efficient. (If you're not a 'C' programmer
I can knock one up). The catch is that we don't know how long a block
is (somewhere between 512 bytes and 128K with the default recordsize),
and presumably ZFS won't CoW a block with identical contents. But it'd
be good enough.

3) If your zpool is anything like full, this is going to make file 
fragmentation far worse.

4) If the imbalance is caused by ZFS choosing to migrate new CoW data to
one particular vdev (or away from another) then this will only
encourage it to continue. ZFS is designed to balance writes between 
vdevs on a zpool for efficiency, so if you start with two balanced vdevs 
they would remain balanced. If you added a vdev and continued writing to 
the zpool, it would tend towards balance over time. It favours vdevs 
with the most space for a write if all other things are equal, so normal 
usage should drift data away from the existing vdevs and on to the
new. Key being "all other things are equal".
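
For what it's worth, the byte-flipping idea in point 2 could look
something like the sketch below. It is untested against a real pool, and
several things in it are my assumptions rather than anything ZFS
promises: the function name touch_blocks() is made up, the caller has to
guess the step (131072 would match a 128K recordsize), and I haven't
checked that ZFS really relocates a block that ends up with its original
contents.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Invert the first byte of every `step`-byte chunk of a file, then write
 * the original byte straight back. Returns the number of chunks touched,
 * or -1 on any error. `step` is the caller's guess at the block size
 * (an assumption; the program cannot discover it from here).
 */
long touch_blocks(const char *path, off_t step)
{
    assert(step > 0);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    long touched = 0;
    for (off_t off = 0; off < st.st_size; off += step) {
        unsigned char orig, flipped;

        if (pread(fd, &orig, 1, off) != 1)
            goto fail;
        flipped = (unsigned char)~orig;
        /* Write the inverted byte, then restore the original one. */
        if (pwrite(fd, &flipped, 1, off) != 1)
            goto fail;
        if (pwrite(fd, &orig, 1, off) != 1)
            goto fail;
        touched++;
    }

    /* Push the dirty data out before reporting success. */
    if (fsync(fd) != 0)
        goto fail;
    return close(fd) == 0 ? touched : -1;

fail:
    close(fd);
    return -1;
}
```

Call it from a small main() that passes each filename and your guessed
step. Bear in mind that if it dies between the two writes you are left
with one corrupted byte in the file, so don't run it on anything you
can't restore.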

So I think you need to find out why ZFS has decided your vdevs are more
efficient unbalanced (whether it's right or not). More writes are just
going to make matters worse. If it's changed its mind, normal use will
balance it over time.
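
To put a rough shape on "balance it over time", here is a toy model. It
is emphatically not the real ZFS allocator: the function churn() and its
two rules are my own simplification. Each step rewrites a fixed amount
of existing data, the old copies are freed in proportion to what each
vdev currently holds, and the new copies land in proportion to each
vdev's free space, i.e. the "all other things are equal" case.

```c
#include <assert.h>

#define NVDEV 2

/*
 * Toy model only. cap[] is each vdev's capacity and used[] its current
 * allocation, in the same arbitrary units. Each step rewrites `chunk`
 * units: frees are spread by current usage, new writes by free space.
 */
void churn(const double cap[NVDEV], double used[NVDEV], double chunk, int steps)
{
    assert(chunk >= 0.0);

    for (int s = 0; s < steps; s++) {
        double total_used = 0.0, total_free = 0.0;
        for (int i = 0; i < NVDEV; i++) {
            total_used += used[i];
            total_free += cap[i] - used[i];
        }
        /* Nothing to rewrite, or nowhere to put it. */
        if (total_used <= 0.0 || total_free <= 0.0)
            return;

        double next[NVDEV];
        for (int i = 0; i < NVDEV; i++) {
            double freed = chunk * used[i] / total_used;
            double written = chunk * (cap[i] - used[i]) / total_free;
            next[i] = used[i] - freed + written;
        }
        for (int i = 0; i < NVDEV; i++)
            used[i] = next[i];
    }
}
```

Start it with two equal vdevs, one 80% full and one empty, and the model
settles at 40% on each. If your real pool isn't heading that way, other
things are not equal, which is the bit worth investigating.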

Regards, Frank.