[Bug 267628] growfs deadlocks if output is redirected to the filesystem being grown

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 07 Nov 2022 22:01:27 UTC

            Bug ID: 267628
           Summary: growfs deadlocks if output is redirected to the
                    filesystem being grown
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bin
          Assignee: bugs@FreeBSD.org
          Reporter: cgull@glup.org

growfs deadlocks in disk wait on 'suspfs' if its stdout/stderr is redirected to
a file on the filesystem being grown.  This appears to be because it suspends
the filesystem with the UFSSUSPEND ioctl, and then writes disk block numbers to
its output as a progress indicator as it writes new cylinder groups.  Those
writes land on the suspended filesystem and block, so growfs never gets far
enough to resume it.

The reproduction is easy:

mdconfig -s 20m
gpart create -s gpt md0
gpart add -t freebsd-ufs -s 10m -i 1 md0
newfs md0p1
mount /dev/md0p1 /mnt
gpart resize -i 1 md0
growfs -y md0p1 > /mnt/growfs.log

Reproduced in 13.1p2 and a -CURRENT snapshot dated Nov 3 07:57 with identifier

Also, if the system is shut down with a growfs in this state, disk syncing
hangs and the system never halts/reboots.  I'll open a separate bug for that.

Ideas for fixes (some are not complete fixes):

* fstat() stdout/stderr and compare their fsids against the filesystem we're
growing.  If either is found to match, exit with an error message.  I think
that may be a complete solution for growfs (it can no longer uninterruptibly
deadlock).  'growfs | tee /growing-fs/growfs.log' will still hang if growfs
produces enough output, but growfs will be interruptible, and tee will make
progress after the ufssuspend state is exited.

* Write new cylinder groups first, before doing UFSSUSPEND and writing the
formerly-last cg and superblock.  This is not a complete fix for this
particular issue (if growfs reports errors while suspended it will still
deadlock), but it gets about 90% of the way there.  It has the added benefits
of doing most of the I/O (and encountering most possible errors) before making
irreversible changes to the existing filesystem, and of reducing the amount of
time the filesystem is suspended.

* Don't report progress or errors while in ufssuspend state.

* On non-ttys, set stdio buffers so that progress indicators are not written
until completion.

I'll try to come up with patches for this.
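
For the stdio-buffering idea in the list above, a minimal sketch using
setvbuf(3) might look like the following.  The helper name
defer_output_unless_tty() and the buffer size are assumptions for
illustration.  This is only a partial mitigation: if the output outgrows the
buffer, stdio will still flush it in the middle of the suspend.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * If fp is not a terminal, switch it to full buffering with a private
 * buffer of bufsize bytes, so nothing is written until the buffer
 * fills or the stream is flushed.  Must be called before any I/O on fp.
 * Returns 1 if buffering was changed, 0 if fp is a terminal and was
 * left alone, -1 on error.
 */
int
defer_output_unless_tty(FILE *fp, size_t bufsize)
{
	char *buf;

	if (fp == NULL || bufsize == 0)
		return (-1);
	if (isatty(fileno(fp)))
		return (0);
	buf = malloc(bufsize);
	if (buf == NULL)
		return (-1);
	if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
		free(buf);
		return (-1);
	}
	/* buf must stay valid until fp is closed, so it is not freed. */
	return (1);
}
```

A caller would invoke this on stdout (and possibly stderr) at startup, and
call fflush() once the filesystem has been resumed.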

