RE: ZFS reservations for type=volume

From: Michael Jung <mikej_at_paymentallianceintl.com>
Date: Thu, 21 Apr 2022 23:39:55 UTC
Thank you, Rich and Alan. I should have paid closer attention and thought about it a little more.

df does not help with type=volume, because a volume is normally just a block device rather than a mounted ZFS filesystem, although it can be (see below).

Hopefully this will help someone else.

root@draid:/usr/src/stand/libsa/zfs # zfs list -o used  raid-5400-1
USED
18.0T
root@draid:/usr/src/stand/libsa/zfs # zfs list -o used  raid-5400-1/esxi-store1
USED
16.9T    <- My reservation amount
root@draid:/usr/src/stand/libsa/zfs #



root@draid:/usr/src/stand/libsa/zfs # zfs list -o avail  raid-5400-1
AVAIL
2.08T
root@draid:/usr/src/stand/libsa/zfs # zfs list -o avail  raid-5400-1/esxi-store1
AVAIL
18.5T

I guess, as Mr. Obvious would say, “zfs <command>” is the dataset and “zpool <command>” is the pool.

root@draid:/usr/src/stand/libsa/zfs # zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
raid-5400-1              18.0T  2.08T   236K  /raid-5400-1  <- Pool
raid-5400-1/esxi-store1  16.9T  18.5T   522G  -             <- type=volume with reservation
raid-5400-1/unitrends1   1.08T  2.43T   752G  -             <- type=volume without reservation
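
One way to read the listing above: the 18.5T AVAIL on the volume looks like the pool's AVAIL plus the part of the volume's own reservation that has not been written yet. The sketch below only checks that arithmetic with the figures from this listing. It assumes that USED on the volume is its reserved size and that REFER is what has actually been written; the helper name is made up for illustration.

```python
# Rough check of the AVAIL figures from the "zfs list" output above.
# Assumption: a dataset's AVAIL = parent's AVAIL + the unwritten part of
# its own reservation. This is a reading of the numbers, not ZFS code.

GIB_PER_TIB = 1024


def volume_avail_tib(parent_avail_tib, reserved_tib, referenced_gib):
    """Hypothetical helper: AVAIL as seen by the reservation holder, in TiB."""
    unwritten_tib = reserved_tib - referenced_gib / GIB_PER_TIB
    return parent_avail_tib + unwritten_tib


# Figures from the listing: pool AVAIL 2.08T, esxi-store1 USED 16.9T, REFER 522G.
estimate = volume_avail_tib(2.08, 16.9, 522)
print(f"estimated AVAIL for esxi-store1: {estimate:.2f}T")
```

The estimate lands at roughly 18.5T, which is in line with the AVAIL column for raid-5400-1/esxi-store1.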

Note: I have simply not reserved all 18.5T of raid-5400-1/esxi-store1 yet, but I am doing that now. I never had a reason to use reservations in the past, but I have a test lab at home with a lot of virtualization, all being migrated to iSCSI targets, and I certainly don’t want to have to worry about thin <pick your guest type> on top of thin LUNs, so I’m starting with some disposable pools to learn. When I start shuffling the real data around (~50TB+) I don’t want unforeseen issues ;-) No backups for everything ($$$), but it’s close enough ☺

--mikej




CONFIDENTIALITY NOTE: This message is intended only for the use
of the individual or entity to whom it is addressed and may
contain information that is privileged, confidential, and
exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby
notified that any dissemination, distribution or copying
of this communication is strictly prohibited. If you have
received this transmission in error, please notify us by
telephone at (502) 212-4000 or notify us at: PAI, Dept. 99,
2101 High Wickham Place, Suite 101, Louisville, KY 40245



From: Rich [mailto:rincebrain@gmail.com]
Sent: Thursday, April 21, 2022 5:59 PM
To: Alan Somers <asomers@freebsd.org>
Cc: Michael Jung <mikej@paymentallianceintl.com>; freebsd-fs <freebsd-fs@freebsd.org>
Subject: Re: ZFS reservations for type=volume

(My FBSD dev env isn't booted, but Linux and FBSD are the same in this regard...)

# zpool list workspace
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
workspace  35.4T  12.9T  22.5T        -         -     3%    36%  1.00x  DEGRADED  -
# zfs create workspace/testme
# zfs list workspace workspace/testme
NAME               USED  AVAIL     REFER  MOUNTPOINT
workspace         13.0T  21.5T     48.9G  /workspace
workspace/testme    96K  21.5T       96K  /workspace/testme
# df -h /workspace/testme /workspace/
Filesystem        Size  Used Avail Use% Mounted on
workspace/testme   22T  128K   22T   1% /workspace/testme
workspace          22T   49G   22T   1% /workspace
# zfs set reservation=1T workspace/testme
# zpool list workspace
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
workspace  35.4T  12.9T  22.5T        -         -     3%    36%  1.00x  DEGRADED  -
# zfs list workspace
NAME               USED  AVAIL     REFER  MOUNTPOINT
workspace         14.0T  20.5T     48.9G  /workspace
workspace/testme    96K  21.5T       96K  /workspace/testme
# df -h /workspace/testme /workspace/
Filesystem        Size  Used Avail Use% Mounted on
workspace/testme   22T  128K   22T   1% /workspace/testme
workspace          21T   49G   21T   1% /workspace
#

That is, AIUI, a reservation counts against the USED/AVAIL shown by "zfs list" or "df" for everything other than the reservation holder, not against the ALLOC/FREE shown by "zpool list"... unless the space is actually allocated, of course.
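
To make the before/after numbers in the transcript easier to follow, here is a toy model of that accounting. It is only a sketch under the assumption stated in the comments, not how ZFS computes anything internally, and the function name is invented for this example.

```python
# Toy model of the accounting in the transcript above (not real ZFS code).
# Assumption: the unused part of a reservation is charged to the parent's
# USED and taken out of the parent's AVAIL, but the dataset holding the
# reservation can still use it, so its own AVAIL does not shrink.


def apply_reservation(parent_used, parent_avail, child_used, reservation):
    """Return (parent_used, parent_avail, child_avail) in TiB after reserving."""
    unused = max(reservation - child_used, 0.0)
    new_parent_used = parent_used + unused
    new_parent_avail = parent_avail - unused
    child_avail = new_parent_avail + unused  # holder keeps access to its reservation
    return new_parent_used, new_parent_avail, child_avail


# workspace before: USED 13.0T, AVAIL 21.5T; testme holds ~96K (treated as 0 here).
used, avail, testme_avail = apply_reservation(13.0, 21.5, 0.0, 1.0)
print(used, avail, testme_avail)
```

With a 1T reservation this gives 14.0T USED and 20.5T AVAIL for workspace while workspace/testme stays at 21.5T, matching the "zfs list" output. Nothing in the model touches ALLOC, which is why "zpool list" looks the same before and after.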

- Rich

On Thu, Apr 21, 2022 at 5:46 PM Alan Somers <asomers@freebsd.org> wrote:
Isn't 1.52T the "AVAIL" value, and 24.6T is the "USED"?

On Thu, Apr 21, 2022 at 2:56 PM Michael Jung <mikej@paymentallianceintl.com> wrote:
>
> Ok, that makes sense... so even though I have been running ZFS since 9.x, I'll ask the newbie question.
>
> If it contributes to the parent's "used" value, why is it not reflected here in "USED" or in "AVAIL"?
>
> I would expect USED to be my reservation amount of 15.6T + whatever space was
> being used by other things on the pool. "USED" still sits at 1.52T.
>
> NAME TYPE USED AVAIL RATIO COMPRESS RESERV REFRESERV VOLSIZE
> raid-5400-1 24.6T 1.52T 23.1T - - 2% 6% 1.00x ONLINE
>
> Thanks again.
>
> -----Original Message-----
> From: Alan Somers [mailto:asomers@freebsd.org]
> Sent: Thursday, April 21, 2022 4:40 PM
> To: Michael Jung <mikej@paymentallianceintl.com>
> Cc: freebsd-fs <freebsd-fs@freebsd.org>
> Subject: Re: ZFS reservations for type=volume
>
> A dataset's reservation is local. It doesn't contribute to its parent's reservation. Otherwise, you wouldn't be able to separately set a reservation on the parent. But it _does_ contribute to the parent's "used" value. In that way, you're prevented from reserving too much data on the parent's children.
> -Alan
>
> On Thu, Apr 21, 2022 at 2:28 PM Michael Jung <mikej@paymentallianceintl.com> wrote:
> >
> > I have a ZFS block dataset, raid-5400-1/esxi-store1, that I share as an
> > iSCSI target, and that works great. I have set a reservation on that
> > block device equal to its size so that it is not sparse. Thus, while I
> > could overprovision guests on the provided LUN, the storage presented as
> > the LUN capacity will always be available. At least this is what I want
> > to achieve.
> >
> >
> >
> > What I find strange is that the reservation does not seem to be applied
> > to the ZFS pool ‘raid-5400-1’. Do you really need to set your maximum
> > reservation at the pool level, and then apply reservations to all
> > datasets on that volume? And if so, I would assume you could never set
> > reservations for datasets totaling more than what was reserved for the
> > pool ‘raid-5400-1’.
> >
> >
> >
> > I could build out a test environment and figure out the constraints, but
> > I’d really like to know “how it is supposed to work”, not “how I find it
> > to work”.
> >
> >
> >
> > Thanks in advance.
> >
> >
> >
> > FreeBSD 14.0-CURRENT #4 main-n253875-8e72f458c6d:
> >
> >
> >
> > (this is a raidz2 pool – not my draid pool)
> >
> >
> >
> >
> >
> > root@draid:/usr/src/contrib/bearssl # zfs list -o name,type,used,avail,ratio,compression,reservation,refreservation,volsize raid-5400-1
> > NAME         TYPE        USED   AVAIL  RATIO  COMPRESS  RESERV  REFRESERV  VOLSIZE
> > raid-5400-1  filesystem  18.0T  2.08T  1.36x  on        none    none       -  <- no reservation @pool
> > root@draid:/usr/src/contrib/bearssl #
> >
> >
> >
> > root@draid:/usr/src/contrib/bearssl # zfs list -o name,type,used,avail,ratio,compression,reservation,refreservation,volsize raid-5400-1/esxi-store1
> > NAME                     TYPE    USED   AVAIL  RATIO  COMPRESS  RESERV  REFRESERV  VOLSIZE
> > raid-5400-1/esxi-store1  volume  16.9T  18.5T  1.78x  zstd      15.6T   16.9T      15.6T  <- reservation @dataset
> >
> >
> >
> > root@draid:/usr/src/contrib/bearssl # zpool list
> > NAME          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
> > ccache       9.50G  9.10G   406M        -         -    88%    95%  1.00x    ONLINE  -
> > raid-5400-1  24.6T  1.52T  23.1T        -         -     2%     6%  1.00x    ONLINE  -  <- Free does not reflect reservation @pool
> > tank         18.5T   605G  17.9T        -         -     0%     3%  1.00x    ONLINE  -
> > zfsroot       103G  33.3G  69.7G        -         -    31%    32%  1.00x    ONLINE  -
> > root@draid:/usr/src/contrib/bearssl #
> >
> > Disclaimer
> >
> > The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
> >
> > This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.