Filesystem operations slower in 13.0 than 12.2
Christos Chatzaras
chris at cretaforce.gr
Sat Mar 6 06:37:34 UTC 2021
Hello Konstantin,
> On 6 Mar 2021, at 01:12, Konstantin Belousov <kostikbel at gmail.com> wrote:
>
> There were (are) bugs in FreeBSD UFS SU < 13
> - some LoR (lock order reversal) existed in SU code, where it needed to
> lock a containing directory to provide POSIX guarantees for fsync(),
> while owning the vnode lock. I do not believe it is observable in
> real-world use
If you are talking about these changes:
https://svnweb.freebsd.org/base?view=revision&revision=367672
then I could trigger "processes hanging in ufs state" only while doing Prestashop translations: clicking on "Save" removes and recreates the Prestashop cache in the /var/cache/prod directory. I have used FreeBSD since 6.x and this was the first time I could trigger it (maybe it's related to the specific Prestashop version too).
> - in some situations UFS SU in < 13 did not perform the necessary fsync()
> of the directory, related to the previous item
> The end result was that after a successful fsync() followed by a system
> failure, e.g. power loss or panic, the parent directory for the synced
> vnode would not be synced and the vnode's dirent would not be written to
> permanent store. This violates the POSIX requirement that after fsync, the
> data can be read, since you simply cannot open the file.
>
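To make sure I understand the requirement, here is a minimal sketch (Python, assuming a POSIX system) of what an application has to do by hand when it cannot rely on fsync() of the file also making the directory entry durable. The helper name durable_write is hypothetical and only for illustration; it is not taken from any FreeBSD code.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path, then fsync the file and its parent directory.

    Hypothetical helper: the explicit directory fsync covers the case
    where the file's own fsync does not persist its directory entry.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file data and inode
    finally:
        os.close(fd)

    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)  # flush the directory entry that names the file
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        durable_write(os.path.join(d, "example.txt"), b"hello\n")
```

If I read the explanation correctly, the second fsync is the part that UFS SU before 13 could skip internally, so a file that was fsync()ed could still be unreachable by name after a crash.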
> During the development of the patch to fix both the LoR and the related
> omission of fsync, a mistake was made resulting in much more aggressive
> syncing of directories. It was not exactly that, but approximately, on
> most metadata operations that created or removed a directory entry,
> the directory was fully synced. This resulted in a significant
> slowdown, which was eliminated around BETA4..RC1. I.e. most of the fixes
> went into BETA4, but minor parts were only discovered later and were
> ready for RC1.
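For anyone who wants to compare releases, a rough sketch of how the cost of creating and removing directory entries could be timed. This is only an assumption about a reasonable micro-benchmark, not the test I originally ran; the function name time_create_remove and the count of 1000 are made up for illustration. Note that the default temporary directory may live on tmpfs, so pass a directory on the UFS filesystem you want to measure.

```python
import os
import tempfile
import time


def time_create_remove(directory: str, count: int = 1000) -> float:
    """Return seconds spent creating and then unlinking `count` empty files."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"entry-{i}")
        # O_EXCL ensures every iteration really adds a new directory entry.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        os.close(fd)
    for i in range(count):
        os.unlink(os.path.join(directory, f"entry-{i}"))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: replace dir=None with a path on the filesystem under test.
    with tempfile.TemporaryDirectory(dir=None) as d:
        elapsed = time_create_remove(d)
        print(f"{elapsed:.3f} s for 1000 creates and 1000 removes")
```

Running the same script on 12.2 and on a 13.0 BETA/RC on the same hardware should show whether metadata operations are being synced more aggressively.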
I ask these questions to better understand how a FreeBSD developer works (and more specifically when a bug is not reported).
1) How did you discover this LoR / fsync omission bug? Did someone else find and report it (I couldn't find a PR for this)? Was it discovered by a test suite? Or did you find it while doing other work in this part of the code?
2) When I reported the slowdown with BETA2 a few weeks ago, you replied that it was a known bug and that it would be fixed in BETA3 or BETA4.
After the initial patches that made the syncing of directories more aggressive, how did you discover the slowdown?
> There are still more fsync(dir) calls in 13RC1 than there are in any 12,
> by the nature of the bug and its fix, but the current belief is that all
> fsync calls left in the flow are required for correctness.
Thank you for explaining these changes.