ZFS, Vnode cache, and poor directory listing performance via Samba

Dave Baukus daveb at spectralogic.com
Thu Mar 29 14:38:44 UTC 2018


Thank you for the explanation and suggestions, Conrad.
Unfortunately, this absurd directory is at a customer site and was generated by some ill-designed application.

Dave Baukus

On 03/28/2018 08:35 PM, Conrad Meyer wrote:
> Hi Dave,
>
> Full scans are the worst case for an LRU cache.  In particular, you
> are full-scanning an *extremely* large directory, which evicts your
> entire vnode cache.  Then you suffer the (presumably) entirely
> serialized penalty of refetching every single inode from disk again
> after the first scan.
>
> Here are some solutions in order of preference:
> 1. Organize your files better.  1 million in a single directory is
> absurd.  Can Windows Explorer meaningfully navigate a 1-million-file
> directory?  I doubt it.
> 2. Continue to bump maxvnodes to compensate for poor file organization
> + naive clients doing full scans.
> 3. Enhance samba to signal something like DONTNEED on
> "SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *" requests to the OS.
> 3.a. Enhance samba to parallelize or otherwise asynchronously process
> the above requests on huge directories (improve the uncached case).
>
> I don't think this has much to do with ZFS, other than that ZFS
> performance on your hardware appears to be quite bad without the VFS
> cache sitting in front to absorb most of the requests.
>
> Best,
> Conrad
>
>
> On Wed, Mar 28, 2018 at 7:07 PM, Dave Baukus <daveb at spectralogic.com> wrote:
>> Below is narrative angst and woe for which I have the following observations/questions:
>>
>> - Increasing kern.maxvnodes from 600,000 to 2,000,000 apparently solves the "problem"
>> - This decreases the number of lookups in the scenario below from 40719 (some of which take over a second) to 4
>> - 2,000,000 may be extreme, but I was hoping for an authoritative comment on why/how this improves the scenario and
>>     then perhaps I can come up with some reasonable tuning options.
>> - Is this an artifact of the FreeBSD 11-ish refactoring of the ZFS/FreeBSD VNOP interface (?)
>>
>> -----------------------------------------------------
>> I have the following scenario on FreeBSD Stable 11.0:
>>
>> A ZFS file system with a directory containing 1,000,000 files; the root of this file system is
>> exported via Samba using the NFSv4 ACL plugin and DOS attributes with the (<get|set>extattr) implementation.
>>
>> A local full listing of this directory (ls -l > /dev/null) completes in about 40 seconds.
>> A full listing from a Samba client (ls -l) completes in about 3 minutes.
>>
>> Using Windows Explorer from a Win2008 client is where the strangeness begins; it
>> takes between 8 and 12 minutes before control is returned to win-explorer.
>>
>> Tracing this with wireshark I noticed that "SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *"
>> requests from the Win2008 client start off functioning well (client requests
>> 64k of data and samba responds with 64k of directory data). After about 150 seconds of this
>> interaction the client makes a "SMB2_FIND_ID_BOTH_DIRECTORY_INFO Pattern: *" request that is not
>> responded to for over 60 seconds. The windows client closes the connection, starts a new
>> connection, and begins directory listing from ground zero. This pattern continues for
>> 6 to 10 minutes; I never see a final request/response where the server indicates that the
>> listing is complete; I believe win-explorer just gives up.
>>
>> Meanwhile, back on FreeBSD/ZFS I'm running a dtrace script that times the following
>> ZFS VNOPs for the connected Samba server instance:
>>
>> - fbt:zfs:zfs_*extattr:entry and return (get|set|delete|list)extattr
>> - fbt:zfs:zfs_freebsd_lookup:entry and return
>> - fbt:zfs:zfs_freebsd_readdir:entry and return
>> - fbt:zfs:zfs_freebsd_getattr:entry and return
>>
>> This starts off looking like:
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 19931
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3975
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2662
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1711
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1768
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1411
>>    12  27787       zfs_freebsd_readdir:return zfs_freebsd_readdir :: 44325
>>    12  27787       zfs_freebsd_readdir:return zfs_freebsd_readdir :: 38054
>>    12  27787       zfs_freebsd_readdir:return zfs_freebsd_readdir :: 36137
>> ...
>> ... line 11,800
>>    16  27763        zfs_freebsd_getacl:return zfs_freebsd_getacl :: 2709
>>    16  27763        zfs_freebsd_getacl:return zfs_freebsd_getacl :: 2046
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2238
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1452
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1570
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1608
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1571
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1431
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1431
>>    16  27763        zfs_freebsd_getacl:return zfs_freebsd_getacl :: 2856
>>    16  27763        zfs_freebsd_getacl:return zfs_freebsd_getacl :: 1907
>>    16  27809            zfs_getextattr:return zfs_getextattr :: 3537
>>    16  27787       zfs_freebsd_readdir:return zfs_freebsd_readdir :: 45135
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2744
>>    16  27809            zfs_getextattr:return zfs_getextattr :: 3221
>>    16  27811           zfs_listextattr:return zfs_listextattr :: 3762
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2090
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2214
>>    16  27809            zfs_getextattr:return zfs_getextattr :: 20112
>>    16  27809            zfs_getextattr:return zfs_getextattr :: 14989
>>    16  27787       zfs_freebsd_readdir:return zfs_freebsd_readdir :: 35946
>>    16  27811           zfs_listextattr:return zfs_listextattr :: 46900
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2115
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1439
>>    16  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 22886
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1449
>>    16  27809            zfs_getextattr:return zfs_getextattr :: 4046
>>    16  27811           zfs_listextattr:return zfs_listextattr :: 2239
>>    16  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 15128
>>    16  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1640
>> ...
>> ... line 175,000
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 85760734
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3617
>>    12  27809            zfs_getextattr:return zfs_getextattr :: 14064
>>    12  27811           zfs_listextattr:return zfs_listextattr :: 4088
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 85586541
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2983
>>    12  27809            zfs_getextattr:return zfs_getextattr :: 11416
>>    12  27811           zfs_listextattr:return zfs_listextattr :: 3230
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 85758027
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3124
>> ...
>> ... line 176,000
>>     1  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1113397903
>>     1  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3189
>>     1  27809            zfs_getextattr:return zfs_getextattr :: 6423
>>     1  27811           zfs_listextattr:return zfs_listextattr :: 3090
>>     1  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1108181740
>>     1  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3267
>>     1  27809            zfs_getextattr:return zfs_getextattr :: 5486
>>     1  27811           zfs_listextattr:return zfs_listextattr :: 3111
>>     1  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1092061756
>>     1  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3113
>>     1  27809            zfs_getextattr:return zfs_getextattr :: 5691
>>     1  27811           zfs_listextattr:return zfs_listextattr :: 3073
>>     1  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1102236755
>>     1  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3435
>>     1  27809            zfs_getextattr:return zfs_getextattr :: 5862
>>     1  27811           zfs_listextattr:return zfs_listextattr :: 3771
>>     1  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1101668231
>>     1  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3189
>>     1  27809            zfs_getextattr:return zfs_getextattr :: 6671
>>    15  27811           zfs_listextattr:return zfs_listextattr :: 12951
>>    15  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1061648117
>>    15  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 5365
>>    15  27809            zfs_getextattr:return zfs_getextattr :: 5731
>>    21  27811           zfs_listextattr:return zfs_listextattr :: 8178
>>    21  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 64429430
>>    21  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2912
>>    21  27809            zfs_getextattr:return zfs_getextattr :: 5566
>>    21  27811           zfs_listextattr:return zfs_listextattr :: 2454
>>    19  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1017176234
>>    19  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 2976
>>    19  27809            zfs_getextattr:return zfs_getextattr :: 6230
>>    19  27811           zfs_listextattr:return zfs_listextattr :: 2710
>>    19  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 64211015
>>    19  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1876
>>    19  27809            zfs_getextattr:return zfs_getextattr :: 3690
>>    19  27811           zfs_listextattr:return zfs_listextattr :: 2292
>>    19  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 17007
>>    19  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1766
>>    19  27809            zfs_getextattr:return zfs_getextattr :: 3357
>>    19  27811           zfs_listextattr:return zfs_listextattr :: 2331
>>    19  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 63817436
>>    19  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1827
>>    19  27809            zfs_getextattr:return zfs_getextattr :: 12231
>>    12  27811           zfs_listextattr:return zfs_listextattr :: 8658
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 64859702
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 3296
>>    12  27809            zfs_getextattr:return zfs_getextattr :: 6118
>>    12  27811           zfs_listextattr:return zfs_listextattr :: 2454
>>    12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 17442
>>    12  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 1676
>>    12  27809            zfs_getextattr:return zfs_getextattr :: 3649
>>    12  27811           zfs_listextattr:return zfs_listextattr :: 2363
>>     0  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1013471141
>>     0  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 5995
>>     0  27809            zfs_getextattr:return zfs_getextattr :: 9280
>>     0  27811           zfs_listextattr:return zfs_listextattr :: 3219
>>     0  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 64286196
>>     0  27765       zfs_freebsd_getattr:return zfs_freebsd_getattr :: 5618
>>     0  27809            zfs_getextattr:return zfs_getextattr :: 8919
>>     0  27811           zfs_listextattr:return zfs_listextattr :: 3117
>>    13  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 999431953
>>    13  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1062322808
>>     9  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 1061885578
>>     9  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 11283
>>
>> At this point the client closes the connection and the connected Samba server process exits.
>>
>> After increasing the vnodes to 2M, the wire transfer of the directory listing completes
>> in about 60 seconds with the final "no more files" response status observed,
>> and win-explorer cogitates on the data for about another 2 minutes
>> before control is returned to win-explorer.
>>
>> Thanks for any feedback.
>>
>> --
>> Dave Baukus
>> _______________________________________________
>> freebsd-fs at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
> .
>
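
Conrad's point that a full scan is the worst case for an LRU cache can be illustrated with a toy model. The sketch below is not the FreeBSD vnode cache; LRUCache and second_scan_hits are hypothetical names, and the sizes are scaled down by 1000 from the numbers in this thread (1,000,000 files, kern.maxvnodes of 600,000 versus 2,000,000). It assumes a strict LRU policy and two scans in the same order.

```python
from collections import OrderedDict


class LRUCache:
    """Toy strict-LRU cache (an illustration, not the kernel's vnode cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)          # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = True
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict least recently used


def second_scan_hits(n_files, capacity):
    """Scan n_files entries twice in the same order; return hits on the second scan."""
    cache = LRUCache(capacity)
    for f in range(n_files):
        cache.access(f)
    cache.hits = 0                                 # count only the second scan
    for f in range(n_files):
        cache.access(f)
    return cache.hits


if __name__ == "__main__":
    # Scaled-down stand-ins for 1,000,000 files with 600,000 vs 2,000,000 vnodes.
    print("cache smaller than directory:", second_scan_hits(1000, 600))
    print("cache larger than directory: ", second_scan_hits(1000, 2000))
```

In this model a cache even slightly smaller than the directory gives no hits on the second pass, because each entry is evicted shortly before it is needed again, while a cache that holds the whole directory hits every time. That is at least consistent with the sharp change seen after raising kern.maxvnodes.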

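Counts such as "40719 lookups, some over a second" can be pulled out of trace output like the lines quoted above with a small script. This is a sketch, not the script used in the thread: summarize and LINE_RE are hypothetical names, the line layout is inferred from the quoted output, and the trailing number is assumed to be elapsed nanoseconds (the largest values, about 1.1e9, would then match "over a second").

```python
import re
from collections import defaultdict

# Matches lines shaped like the quoted output (optionally with "> " quoting):
#   12  27777        zfs_freebsd_lookup:return zfs_freebsd_lookup :: 19931
LINE_RE = re.compile(
    r"^\s*(?:>+\s*)?\d+\s+\d+\s+\S+:return\s+(\w+)\s+::\s+(\d+)\s*$"
)


def summarize(lines, slow_ns=1_000_000_000):
    """Return {vnop: (count, max_elapsed, slow_count)} for matching lines.

    slow_ns is the threshold for a "slow" call; the unit is assumed to be
    nanoseconds, which the thread does not state explicitly.
    """
    stats = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue                      # skip "..." markers and other text
        name, elapsed = m.group(1), int(m.group(2))
        entry = stats[name]
        entry[0] += 1
        entry[1] = max(entry[1], elapsed)
        if elapsed >= slow_ns:
            entry[2] += 1
    return {name: tuple(values) for name, values in stats.items()}


if __name__ == "__main__":
    import sys
    for name, (count, worst, slow) in sorted(summarize(sys.stdin).items()):
        print(f"{name:24s} calls={count:8d} max={worst:12d} slow={slow}")
```

Run as a filter over a saved trace file, it prints one line per VNOP, which makes before/after comparisons of a kern.maxvnodes change easier than reading the raw output.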
