Expanding storage in a ZFS pool using draid

From: Freddie Cash <fjwcash_at_gmail.com>
Date: Wed, 03 May 2023 16:08:50 UTC
I might be missing something, or not understanding how draid works "behind
the scenes".

With a ZFS pool using multiple raidz vdevs, it's possible to increase the
available storage in a pool by replacing each drive in a raidz vdev with a
larger one.  Once the last drive is replaced, either the extra storage space
appears automatically (if autoexpand is on), or you run
"zpool online -e <poolname> <disk>" for each disk.
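
For reference, the procedure above can be sketched as a dry-run shell
function.  The pool name "tank" and the device names in the usage line are
hypothetical, and the function only prints the commands instead of running
them, so treat it as an outline rather than a tested script.

```shell
#!/bin/sh
# Dry-run sketch: print the commands used to grow a raidz vdev by
# replacing its disks one at a time.  Nothing here touches a real pool.
# Usage: plan_raidz_grow POOL OLD1:NEW1 [OLD2:NEW2 ...]
plan_raidz_grow() {
    pool=$1
    shift
    # With autoexpand on, the extra space should appear once the last
    # disk in the vdev has been replaced.
    echo "zpool set autoexpand=on $pool"
    for pair in "$@"; do
        old=${pair%%:*}
        new=${pair##*:}
        echo "zpool replace $pool $old $new"
        # Let each resilver finish before starting the next replace.
        echo "zpool wait -t resilver $pool"
    done
    # Otherwise, expand each new disk by hand.
    for pair in "$@"; do
        echo "zpool online -e $pool ${pair##*:}"
    done
}
```

Calling it as "plan_raidz_grow tank da0:da6 da1:da7" prints the plan for
two disks; the printed lines can be reviewed before anything is run for real.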

For example, if you create a pool with 2 raidz vdevs using 6x 1 TB drives
per vdev you'll end up with ~ 10 TB of space available to the pool.  Later,
you can replace all 6 drives in one raidz vdev with 2 TB drives, and get an
extra 5 TB of free space in the pool.  Later, you can replace the 6 drives
in the other raidz vdev with 2 TB drives, and get another 5 TB of free
space in the pool.
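
The numbers in that example can be checked with a little shell arithmetic.
This is a rough sketch that assumes single-parity raidz (which is what
~10 TB from 12x 1 TB drives implies) and ignores metadata and padding
overhead; the function name is made up for illustration.

```shell
#!/bin/sh
# Approximate usable capacity of one raidz vdev, in TB:
# (disks - parity) * size of each disk.  Overhead is ignored.
raidz_usable_tb() {
    ndisks=$1
    nparity=$2
    size_tb=$3
    echo $(( (ndisks - nparity) * size_tb ))
}

# Both vdevs on 1 TB drives.
before=$(( 2 * $(raidz_usable_tb 6 1 1) ))
# One vdev upgraded to 2 TB drives, the other still on 1 TB drives.
after=$(( $(raidz_usable_tb 6 1 2) + $(raidz_usable_tb 6 1 1) ))

echo "before: ${before} TB, after: ${after} TB, gained: $(( after - before )) TB"
```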

We've been doing this for years, and it works great.

When draid became available, we configured our new storage pools using that
instead of multiple raidz vdevs.  One of the pools uses 44x 2 TB drives,
configured in a draid pool using:
nparity: 2
draid_ndata: 4
draid_ngroups: 7
draid_nspares: 2

IIUC, this means the drives are configured in 7 groups of 6, using 4 drives
for data and 2 for parity in each group, with 2 drives configured as spares.
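
As a sanity check on that reading, the drive count can be rebuilt from the
four parameters.  This sketch treats the groups as a purely logical layout
and ignores overhead.  The 2 TB drive size comes from the pool description
above; the vdev spec string it prints is only what these values would look
like in "zpool create" syntax, and it is an assumption that the pool was
created with exactly that string.

```shell
#!/bin/sh
# Parameters as reported for the pool, plus the 2 TB drive size.
parity=2
ndata=4
ngroups=7
nspares=2
disk_tb=2

group_width=$(( ndata + parity ))                # data + parity per group
children=$(( ngroups * group_width + nspares ))  # total drives in the vdev
data_tb=$(( ngroups * ndata * disk_tb ))         # nominal data capacity

echo "children: $children"
echo "nominal data capacity: ${data_tb} TB"
# Assumed "zpool create" form: draid<parity>:<data>d:<children>c:<spares>s
echo "vdev spec: draid${parity}:${ndata}d:${children}c:${nspares}s"
```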

The pool works great, but we're running out of space.  So, we replaced the
first 6 drives in the pool with 4 TB drives, expecting to get an extra
4*(4-2)=8 TB of free space in the pool.  However, to our great surprise, that
is not the case!  Total storage capacity of the pool has not changed, even
after running "zpool online -e" against each of the 4 TB drives.

Do we need to replace EVERY drive in the draid vdev in order to get extra
free space in the pool?  Or is there some other command that needs to be
run to tell ZFS to use the extra storage space available?  Or ... ?

Usually, we just replace drives in groups of 6, going from 1 TB to 2 TB to
4 TB as needed.  Having to buy 44 drives (or 88 for our other draid-using
storage server) and replace them all at once is going to be a massive (and
expensive) undertaking!  That might be enough to rethink how we use draid
going forward.  :(

Freddie Cash