[Bug 271460] ctld ports become inaccessible due to concurrent service restarts

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 16 May 2023 23:06:51 UTC

            Bug ID: 271460
           Summary: ctld ports become inaccessible due to concurrent
                    service restarts
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: asomers@FreeBSD.org

Created attachment 242225
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=242225&action=edit
Example ctl configuration file

If two separate processes run "service ctld restart" at the same time, they
can race.  The result is ctl ports that are inaccessible (clients can't
connect) and that don't get torn down after ctld exits.  Starting ctld again
does not fix the stuck ports (though new ports can be added).  The only remedy
is to restart.

Steps to reproduce
1) Create about 32 zvols (I've also observed this bug with file-backed LUNs)
2) Configure /etc/ctl.conf as shown in the attached file
3) Run the following bash loop in two separate terminals:
for ((i=0; i<10000; i++)); do service ctld onerestart || break; done

After some time, usually less than a second, one terminal will fail with an
error like this:
ctld: LUN modification error: LUN 31 is not managed by the block backend
ctld: failed to modify lun "disk31", CTL lun 31
ctld: CTL_LUN_MAP ioctl failed: Device not configured
ctld: failed to apply configuration; exiting
/etc/rc.d/ctld: WARNING: failed to start ctld

Next, kill the loop in the other terminal, make sure that no ctld process is
running, and run "ctladm portlist".  All 32 ports will still be listed.
Attempting to start ctld one more time will result in an error like this:

ctld: error returned from port creation request: target
"iqn.2018-10.myhost:disk0" for portal group tag 257 already exists
ctld: failed to update port pg0-iqn.2018-10.myhost:disk0
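
The two-terminal reproduction above could also be driven from a single shell.
The following is only a sketch, not part of the original report: the names
restart_loop, race_restarts, RESTART_CMD and ITERATIONS are hypothetical and
introduced for illustration, with defaults that mirror the command and loop
count from step 3.  It is written for plain sh rather than bash.

```shell
#!/bin/sh
# Sketch: run the restart loop from the report as two concurrent background jobs.
# RESTART_CMD and ITERATIONS are hypothetical knobs added for illustration;
# their defaults mirror the command and loop count given in the report.
: "${RESTART_CMD:=service ctld onerestart}"
: "${ITERATIONS:=10000}"

# Repeat the restart command until it fails or the iteration limit is reached.
restart_loop() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        $RESTART_CMD || return 1   # unquoted on purpose: split into command + args
        i=$((i + 1))
    done
}

# Start two loops at once and report failure if either one stopped early.
race_restarts() {
    restart_loop &
    pid1=$!
    restart_loop &
    pid2=$!
    rc=0
    wait "$pid1" || rc=1
    wait "$pid2" || rc=1
    return "$rc"
}
```

Sourcing this file and calling race_restarts, with sufficient privileges to
restart ctld, should start both loops.  Unlike the manual steps, the sketch
does not kill the surviving loop when the other one fails; it waits for both,
so the second loop may need to be interrupted by hand.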
