[Bug 269883] net/samba416: macOS Time Machine backups broken after contrib/tzcode update in 13.2

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 28 Feb 2023 20:34:49 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=269883

            Bug ID: 269883
           Summary: net/samba416: macOS Time Machine backups broken after
                    contrib/tzcode update in 13.2
           Product: Ports & Packages
           Version: Latest
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: Individual Port(s)
          Assignee: timur@FreeBSD.org
          Reporter: dim@FreeBSD.org
             Flags: maintainer-feedback?(timur@FreeBSD.org)

TL;DR: the "fruit:zero_file_id" setting should be "yes" by default and we
should apply upstream Samba patches for this, otherwise existing Time Machine
backups over SMB can get messed up.

Long story:

I recently encountered problems with Apple's Time Machine backing up to a
FreeBSD server with samba416-4.16.8 installed. These problems started after I
upgraded the base system from 13.1-STABLE (of ~3 months ago) to 13.2-STABLE (as
of a few days ago), and then rebuilt all my ports, including the
samba416-4.16.8 package.

The problems initially showed up as hanging or aborting Time Machine backups.
If you attempted a "verify backups" action, it would run fsck_apfs on the disk
image mounted over SMB, which then showed errors similar to:

/dev/disk5s1: fsck_apfs started at Mon Feb 27 00:20:36 2023
/dev/disk5s1: ** Checking the container superblock.
/dev/disk5s1:    Checking the checkpoint with transaction ID 2199286.
/dev/disk5s1: ** Checking the space manager.
/dev/disk5s1: ** Checking the space manager free queue trees.
/dev/disk5s1: ** Checking the object map.
/dev/disk5s1: ** Checking volume /dev/rdisk5s1.
/dev/disk5s1: ** Checking the APFS volume superblock.
/dev/disk5s1:    The volume Backups of mac was formatted by newfs_apfs (1677.81.1) and last modified by apfs_kext (2142.81.1).
/dev/disk5s1: ** Checking the object map.
/dev/disk5s1: warning: (oid 0x2126b29c) om: btn: invalid o_cksum (0x1700608e55352f42)
/dev/disk5s1:    Object map is invalid.
/dev/disk5s1: ** The volume /dev/rdisk5s1 was found to be corrupt and cannot be repaired.
/dev/disk5s1: ** Verifying allocated space.
/dev/disk5s1: ** The volume /dev/disk5s1 could not be verified completely.
/dev/disk5s1: fsck_apfs completed at Mon Feb 27 00:20:44 2023

Even if you rolled back the share (with zfs rollback) to a known-good state,
i.e. one which had verified successfully in the past, it would *still* get
fsck_apfs errors like the ones above.

However, when I reinstalled the samba416-4.16.8 package from the old poudriere
packages directory, which had been built with 13.1-STABLE, it all worked fine
again, and full fsck_apfs runs were completely OK!

So what was the cause of the difference, even though the port versions were
exactly the same? It turned out to be quite a deep rabbit hole!

After a *lot* of experimentation, swapping back .so files from "good" and "bad"
packages, and even going so far as to swap .o files from "good" and "bad"
builds and re-linking them, I found that the culprit was in libsamba-util.so.0,
specifically the lib/util/time.c.26.o file.

This file got compiled differently on 13.2-STABLE than on 13.1-STABLE:

On 13.2-STABLE, the TIME_T_MAX define would have been set by the configure
script, to the value 67768036191676799ll.

On 13.1-STABLE, the TIME_T_MAX define would *not* have been set by the
configure script, and time.h would then define it as:

  #define TIME_T_MAX MIN(INT32_MAX,_TYPE_MAXIMUM(time_t))

which effectively becomes INT32_MAX, i.e. 0x7fffffff.

This was also visible in one of the changed lines in the build logs (I had logs
from both the poudriere run with 13.1 world, and with 13.2 world):

1606c1606
< Checking for the maximum value of the 'time_t' type                                             : not found
---
> Checking for the maximum value of the 'time_t' type                                             : ok 

So on 13.1 it could not find the maximum value, while on 13.2 it could. The
reason for this is a recent contrib/tzcode update,
<https://cgit.freebsd.org/src/commit/?id=93cc70bf9ca7>, which now makes
gmtime(0x7fffffffffffffffll) fail, whereas it succeeded before. Samba uses this
check in its configure script.

It turns out that Samba uses this TIME_T_MAX value in all kinds of places, but
most importantly (in some cases) it is used to generate SMB file IDs! If
TIME_T_MAX has a different value, some files might get completely different
file ID numbers, and apparently this greatly confuses the Apple SMB client.

The code deriving file IDs from timestamps was added for Samba bug 14928 in
<https://git.samba.org/?p=samba.git;a=commitdiff;h=23fbf0bad03>, around the
Samba 4.15.4 release.

But later, after talking to people at Apple, the Samba developers ripped this
whole thing out again, in
<https://git.samba.org/?p=samba.git;a=commitdiff;h=643da37fd13>:

    smbd: remove itime and file_id logic and code

    This bases File-Ids on the inode numbers again. The whole stuff was
    added because at that time Apple clients

    1. would be upset by inode number reusage and

    2. had a client side bug in their fallback implemetentation that
    assigns File-Ids on the client side in case the server provides
    File-Ids of 0.

    After discussion with folks at Apple it should be safe these days to
    rely on the Mac to generate its own File-Ids and let Samba return 0
    File-Ids.

and its follow-up,
<https://git.samba.org/?p=samba.git;a=commitdiff;h=24f4bea5b8e>:

    vfs_fruit: change default for "fruit:zero_file_id" option to yes

    After discussion with folks at Apple it should be safe these days to rely
    on the Mac to generate its own File-Ids and let Samba return 0 File-Ids.

    Signed-off-by: Ralph Boehme <slow@samba.org>
    Reviewed-by: Jeremy Allison <jra@samba.org>

For now, if anybody encounters this bug with Apple's Time Machine, you should
work around it by setting "fruit:zero_file_id = yes" in your smb4.conf, either
in the [global] section, or in the specific shares for Time Machine (i.e. those
with "fruit:time machine = yes").
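
As a sketch, the per-share variant of the workaround could look like the
fragment below. The share name [timemachine] is a made-up example, and the
fragment leaves out every other setting the share already has (path, VFS
modules and so on):

```
# smb4.conf fragment; [timemachine] is a hypothetical share name
[timemachine]
    fruit:time machine = yes
    fruit:zero_file_id = yes
```

Putting the "fruit:zero_file_id = yes" line in [global] instead applies it to
all shares.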

But it would be nice if we could import the two above Samba commits:

  <https://git.samba.org/?p=samba.git;a=commit;h=643da37fd13>
  <https://git.samba.org/?p=samba.git;a=commit;h=24f4bea5b8e>

because that seems a lot safer. At the very least we should import the last
one, which is trivial because it only sets the default for "fruit:zero_file_id"
to "yes".
