Free space in ZFS

Sat Jun 16 08:35:28 UTC 2012

On 06/15/2012 04:02 PM, John Levine wrote:
> I made a three disk zraid ZFS pool yesterday from three new 1 TB
> disks, which I'm using for backup.  Then I did a backup and made a zfs
> volume.  The free space numbers don't make sense.  This is on 8.3, ZFS
> version 15.
> 
> # zpool list
> NAME      SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
> backup2  2.72T   310G  2.42T    11%  ONLINE  -
> 
> Given that it's zraid, the total available space should be a little
> under 2TB since the third disk is for parity.  But zpool gives me a
> total of 2.72T, as though the third disk was for data.

raidz does not operate entirely like a traditional raid5. It stores
enough redundant information to survive a full disk failure, but that's
where the similarity ends.

When you write to a raid5, the data is striped in even strides across
n-1 disks, and parity is written to the remaining disk (which disk holds
the parity rotates from stripe to stripe). The layout is very rigidly
structured, such that you can always determine where a particular piece
of data will end up by performing simple arithmetic.

When you write data to a raidz1, a single ZFS data block is chopped up
into n-1 equal-sized pieces (plus 1 piece for parity), and stored
wherever it will fit inside the pool. The storage allocator will make
sure that each piece ends up on a separate physical disk, but that's the
only restriction on placement.
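
As a rough illustration of the "n-1 pieces plus parity" idea, here is a
toy Python sketch. It is not how ZFS is implemented: the function names
are made up for this example, the zero padding is a simplification, and
the real allocator works in disk sectors. It only shows why any single
lost piece can be rebuilt from the others when the parity is an XOR.

```python
from functools import reduce


def xor_columns(pieces):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pieces))


def split_with_parity(block, n_disks):
    """Split block into n_disks-1 equal data pieces plus one XOR parity piece.

    Hypothetical helper for illustration only; the block is zero-padded
    so that it divides evenly.
    """
    n_data = n_disks - 1
    piece_len = -(-len(block) // n_data)  # ceiling division
    padded = block.ljust(piece_len * n_data, b"\0")
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(n_data)]
    return pieces + [xor_columns(pieces)]


def rebuild(pieces, missing):
    """Recompute the piece at index `missing` from all the other pieces."""
    survivors = [p for i, p in enumerate(pieces) if i != missing]
    return xor_columns(survivors)


pieces = split_with_parity(b"hello world!", 3)  # 2 data pieces + 1 parity
```

Each of the three pieces would land on a different disk, so losing any
one disk still leaves enough to call rebuild() for the missing piece.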

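So, when looking at the zpool itself, you see raw capacity, parity
included, which is burned through at a rate of n/(n-1) as you commit
data to the pool: three-halves for your 3-disk raidz1 (it would be
four-thirds for a 4-disk one). That is why zpool reports 310G used for
the 206G that zfs list shows.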
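
The arithmetic can be sketched in a few lines of Python. This is only
an approximation under simple assumptions: the function names are
invented for this example, and metadata, padding, and reserved space are
ignored, so real numbers will come out a little different.

```python
def raw_consumed(data_size, n_disks, parity=1):
    """Raw pool space used up by data_size of data on a raidz vdev."""
    return data_size * n_disks / (n_disks - parity)


def usable_capacity(raw_size, n_disks, parity=1):
    """Approximate space left for data out of raw_size of raw capacity."""
    return raw_size * (n_disks - parity) / n_disks


# Figures from the 3-disk raidz1 pool in this thread.
used_raw_gb = raw_consumed(206, 3)                 # 206G of data
usable_tb = round(usable_capacity(2.72, 3), 2)     # 2.72T raw pool size
print(used_raw_gb, usable_tb)
```

With these inputs the first figure is 309.0, close to the 310G that
zpool list reports, and the second is about 1.81, in the same ballpark
as the totals that zfs list and df show.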

> # zfs list
> NAME                USED  AVAIL  REFER  MOUNTPOINT
> backup2             206G  1.58T  31.3K  /backup2
> backup2/20120615    206G  1.58T   206G  /backup2/20120615
> 
> Well, that makes more sense, total is 1.78Tb.

...but, when looking at the dataset itself, you see how much
(compressed, deduped) data is present (since you don't care about parity
at this level), and how much more data the allocator predicts you can
safely store on this dataset (which is affected by things like
compression, deduplication, reservations, and quotas).

> # df -g
> Filesystem        1G-blocks Used Avail Capacity  Mounted on
> backup2                1618    0  1618     0%    /backup2
> backup2/20120615       1825  206  1618    11%    /backup2/20120615
> 
> Now the total is 1.82Tb.  Huh?  The backup filesystems are compressed,
> but surely they're showing me the actual size, not the uncompressed
> size.  Or are they?

Don't bother with df. Because df was designed in the era of static
filesystems that never change capacity and always write verbatim, ZFS
has to be 'creative' to represent the size in a manner that would be in
any way useful to a user. It doesn't always work. Search for 'ZFS versus
df'[1] for more information.

Hope this helps!

[1] https://duckduckgo.com/?q=zfs+versus+df

-- 
Fuzzy love,
-CyberLeo
Technical Administrator
CyberLeo.Net Webhosting
http://www.CyberLeo.Net
<CyberLeo at CyberLeo.Net>

Furry Peace! - http://wwww.fur.com/peace/