Re: cpdup fails silently on automounted NFS src dir

From: Kevin Oberman <rkoberman_at_gmail.com>
Date: Mon, 07 Apr 2025 06:03:28 UTC
On Sun, Apr 6, 2025 at 11:53 AM G. Paul Ziemba <pz-freebsd-stable@ziemba.us>
wrote:

> Summary: interaction between autounmountd and cpdup's mount-point-traversal
> detection truncates tree copies early without error.
>
> I'm running 14-stable and am seeing this both on:
>
> - 14.0-STABLE built from sources of 27 Mar 2024 and also on
> - 14.2-STABLE built from sources of 3 Apr 2025.
>
> There doesn't seem to be anything specific to 14-stable so I'll bet
> this issue also manifests on earlier versions of FreeBSD.
>
> I think I understand what's happening (details below), but I'm
> not sure about the right way to fix it.
>
> Scenario
>
>     A large file tree (in my case, the FreeBSD source tree) is published
>     on an NFS server.
>
>     A FreeBSD NFS client automounts a volume containing this
>     large file tree.
>
>     cpdup attempts to copy the file tree to another location (in my
>     case, that happens to be another NFS filesystem, but I don't think
>     it matters).
>
>     cpdup completes without error, however, the destination directory
>     is incomplete, with many empty directories.
>
> Analysis
>
>     cpdup examines the device ID (st_dev) returned by stat(2) as it
>     traverses the source and destination trees copying directories
>     and files. When it finds an st_dev value different from the initial
>     value at the top of the respective tree, it concludes that it has
>     crossed a mount point and prunes the copy at that point.
>
>     I instrumented cpdup with some additional logging to examine its
>     notion of the src and dst st_dev values and found that, in my
>     test case, in the middle of its tree copy, cpdup started getting
>     unexpected new values of st_dev for the src tree and skipping
>     all directories after that.
>
> --- src/cpdup.c.orig    2025-04-04 15:04:44.623646000 -0700
> +++ src/cpdup.c 2025-04-05 15:10:52.779426000 -0700
> @@ -947,10 +947,15 @@
>          * When copying a directory, stop if the source crosses a mount
>          * point.
>          */
> -       if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo)
> +       if (VerboseOpt >= 2)
> +           logstd("sdevNo: %ld, stat1->st_dev: %ld\n", sdevNo, stat1->st_dev);
> +       if (sdevNo != (dev_t)-1 && stat1->st_dev != sdevNo) {
> +           if (VerboseOpt >= 2)
> +               logstd("setting skipdir due to sdevNo != stat1->st_dev\n");
>             skipdir = 1;
> -       else
> +       } else {
>             sdevNo = stat1->st_dev;
> +       }
>
>     I eventually looked at the automounter and added some logging via
>     devd.conf:
>
>     notify 10 {
>             match "system"          "VFS";
>             match "subsystem"       "FS";
>             action "logger VFS FS msg=$*";
>     };
>
>     And saw the following in /var/log/messages:
>
> Apr  6 10:39:31 f14s-240327-portbuilder me[58694]: VFS FS msg=!system=VFS subsystem=FS type=MOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr  6 10:49:54 f14s-240327-portbuilder me[58761]: VFS FS msg=!system=VFS subsystem=FS type=UNMOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x94ff003a3a000000 owner=0 flags="automounted;"
> Apr  6 10:49:54 f14s-240327-portbuilder me[58770]: VFS FS msg=!system=VFS subsystem=FS type=MOUNT mount-point="/s/public" mount-dev="hairball:/v2/Source/public" mount-type="nfs" fsid=0x95ff003a3a000000 owner=0 flags="automounted;"
>
>     (By the way, st_dev reported by my new cpdup log messages was a
>     rearranged version of "fsid" in the devd messages)
>
>     Note that after ten minutes, the NFS filesystem is unmounted and then
>     immediately remounted.
>
>     The source code of /usr/sbin/autounmountd indicates that it
>     attempts to unmount automounted filesystems ten minutes after
>     they have been mounted (modulo some sleep-related jitter).
>

Just a minor correction: the volume is unmounted 10 minutes after the
reference count of the volume reaches zero. You can keep the volume
mounted indefinitely with any action that keeps the reference count
above zero, such as running "cd /media/automounted_device" or
"ls /media/automounted_device" every 8 minutes. Also, one of the
configuration files can adjust this timeout. (Sorry, but I don't
recall which one.)
-- 
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkoberman@gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683