Re: RFC: Patch for mountd to handle a database for exports

From: Peter Eriksson <pen_at_lysator.liu.se>
Date: Fri, 21 Jul 2023 08:37:12 UTC
Regarding the potential for speed improvement here is a quick little benchmark…

> # wc -l /etc/zfs/exports
>    13116 /etc/zfs/exports


With the original FreeBSD 13.2 code:

> # /usr/bin/time zfs share -a
>        40.08 real        34.00 user         6.04 sys
> # /usr/bin/time zfs share -a
>        40.15 real        34.06 user         6.04 sys


With the DB code in use:

> # service mountd stop ; mv /usr/sbin/mountd.DB /usr/sbin/mountd ; mv /lib/libzfs.so.4.DB /lib/libzfs.so.4
> # rm -f /etc/zfs/exports.db
> # /usr/bin/time zfs share -a
>         2.75 real         0.88 user         1.87 sys
> # /usr/bin/time zfs share -a
>         2.77 real         1.01 user         1.75 sys


This translates into a shorter boot time if nothing else...

- Peter


> On 21 Jul 2023, at 10:22, Peter Eriksson <pen@lysator.liu.se> wrote:
> 
> Hi!
> 
> Here’s a couple of updated patches with some bug fixes. I’ve also moved the database location for ZFS to /etc/zfs/exports.db to mirror the current location.
> 
> I also changed the patch for mountd so that for each “exports” source specified it first tries to open <path>.db which I think makes it easier to use (no need for the “-D” option and it supports multiple sources). Cleaner I think.
> 
> 
> Btw, you can create the database manually from /etc/zfs/exports using “makemap” like this:
> 
>  makemap -f btree /etc/zfs/exports.db </etc/zfs/exports
> 
> 
> Known “bugs”:
> “V4:” 
> Even though you could create an /etc/exports.db with the contents of /etc/exports it will fail with the “V4:” lines since the BTree database will sort the exported entries so that the V4: key will appear last and then mountd will complain when scanning the DB. 
> 
> Also when using the file /etc/exports you can either write “V4:/export -sec=sys” or “V4: /export -sec=sys” (with a space between V4: and the path) so this probably needs some special handling (can’t simply use “makemap -f btree /etc/exports.db </etc/exports” to create it. 
> 
> I did try to fix that but the code quickly got hairy so I didn’t like it. If we really want/need that I’m thinking of creating a special case for the V4: handling, sort of like prefixing the database key with a NUL byte or something (so that it will be sorted first). 
> 
> 
> Multiline options - well, the current ZFS code doesn’t support it either so no change from the current setup but it would be nice to have.
>  Ie, support for things like:
>     /export -sec=krb5
>     /export -sec=sys ro
> 
> 
> 
> <mountd-sharedb.patch><libshare-nfs-sharedb.patch><share.c>
> 
> 
> (I also agree that the USE_SHAREDB probably should be removed, I just have that here for now so that I can quickly disable the code)
> 
> Regarding the patch to the zfs part - I’m not sure which way to go there - the current patch switches to always use the DB. But one could argue that the code could check for an existing /etc/zfs/exports.db and only use the DB-writing code if that already exists. That way it will support both the old way and the new way, but it requires an empty /etc/zfs/exports.db to be pre-created at initial boot time for it to start using it. 
> 
> - Peter
> 
> 
>> On 21 Jul 2023, at 00:50, Rick Macklem <rick.macklem@gmail.com> wrote:
>> 
>> Hi,
>> 
>> Peter Eriksson has submitted an interesting patch that adds support
>> for a database file for exports.  My understanding is that this improves
>> performance when used with the ZFS share property for exporting a
>> large number of file systems.
>> 
>> There are a couple of user visible issues that I'd like feedback from
>> others. (Hopefully Peter can correct me if I get any of these wrong.)
>> 
>> - The patch uses a single database file and a new "-D" option to
>> specify it on the command line.
>> --> I think it might be less confusing to just put the database file(s)
>>       in the exports list and identify them via a ".db" filename suffix.
>> What do others think?
>> 
>> The changes are #ifdef'd on USE_SHAREDB. I'm thinking that this
>> change will be always built, so maybe USE_SHAREDB is not needed?
>> (Obviously mountd(8)'s semantics will only changed if/when database
>> file(s) are provided.)
>> 
>> Once I have feedback on the above, I will put a patch up on
>> phabricator.
>> 
>> rick
>> ps: I believe kevans@ has volunteered to shepperd the ZFS share
>>     property changes.
>> 
>