Re: iSCSI target: Handling in-flight requests during ctld shutdown

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Thu, 30 Dec 2021 22:41:25 UTC
On 12/30/21 12:10 PM, Alexander Motin wrote:
> On 30.12.2021 14:29, John Baldwin wrote:
>> On 12/29/21 1:57 PM, Alexander Motin wrote:
>>> The HARDWARE ERROR is obviously not expected by the initiator.  It
>>> should better not be leaked after we decided to kill the connection.
>>> Initiator may retry it and still work happily after reconnect, but
>>> cleaner would be to not rely on that.  cfiscsi_session_terminate_tasks()
>>> aborts all running commands by CTL_TASK_I_T_NEXUS_RESET, that make them
>>> not return statuses to initiator, but I suppose this is the other side
>>> of the race now.
>>
>> Hmm, I wonder if we should be setting CTL_FLAG_ABORT instead of setting the
>> port_status when aborting an I/O?  The comment in ctl_frontend_iscsi.c
>> claims
>> the backends check the port_status, but I don't see any checks for
>> port_status
>> at all in backends.
> 
> The checks were in all backends, until in 2c7dc6bae9fd I've moved them
> to ctl_datamove_done_process().  Backends don't have any special
> knowledge now to process specific data move error codes.  It was
> complete code duplication between backends and I've tried to organize
> that a bit.  Unfortunately we do not have formalized data move error
> statuses in CTL to do more.  It would be much more interesting (and
> complicated ;)) if iSCSI (for example) could report data transfer error
> without killing connection, but that area in general is not very
> specified in SCSI, since usually each transport has own means of recovery.

Hmmm, but this commit seems to have actually broken this case.  That is,
now the possibly uninitialized data is written to the backing store:

         /*
-        * We set status at this point for read commands, and write
-        * commands with errors.
+        * We set status at this point for read and compare commands.
          */
-       if (io->io_hdr.flags & CTL_FLAG_ABORT) {
-               ;
-       } else if ((io->io_hdr.port_status != 0) &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
-            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
-               ctl_set_internal_failure(&io->scsiio, /*sks_valid*/ 1,
-                   /*retry_count*/ io->io_hdr.port_status);
-       } else if (io->scsiio.kern_data_resid != 0 &&
-           (io->io_hdr.flags & CTL_FLAG_DATA_MASK) == CTL_FLAG_DATA_OUT &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE ||
-            (io->io_hdr.status & CTL_STATUS_MASK) == CTL_SUCCESS)) {
-               ctl_set_invalid_field_ciu(&io->scsiio);
-       } else if ((io->io_hdr.port_status == 0) &&
-           ((io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE)) {
+       if ((io->io_hdr.flags & CTL_FLAG_ABORT) == 0 &&
+           (io->io_hdr.status & CTL_STATUS_MASK) == CTL_STATUS_NONE) {
                 lbalen = ARGS(beio->io);
                 if (lbalen->flags & CTL_LLF_READ) {
                         ctl_set_success(&io->scsiio);

As ctl_datamove_done_process() will call ctl_set_internal_failure(), but
ctl_datamove_done() will still call be_move_done which runs the code above.
The code above doesn't check port_status anywhere, so it will still happily
write the stale data to the backend unless CTL_FLAG_ABORT is set.  Do the
backends need to be checking port_status == 0 again in addition to
CTL_FLAG_ABORT then?

>>    I do see checks for CTL_FLAG_ABORT, and the handler
>> for
>> the CTL_TASK_I_T_NEXUS_RESET does set CTL_FLAG_ABORT on pending requests.
>>
>> For the tasks in sciscsi_session_terminate_tasks(), those should already
>> have
>> CTL_FLAG_ABORT set anyway, but it wouldn't hurt if it were set again by
>> cfiscsi_data_wait_abort().  For the the cfiscsi_task_management_done
>> case I'm
>> less certain, but I suspect there too that returning an internal error
>> status
>> back to the initiator is not expected and that it would be better to
>> just set
>> CTL_FLAG_ABORT and drop any response?
> 
> cfiscsi_data_wait_abort() does only one specific thing -- it aborts
> waiting for data and completes the data move operation, allowing backend
> to continue (or at least free resources, etc).  It is only
> cfiscsi_session_terminate_tasks() knows that it should abort data moves
> while it aborts tasks, since it knows that the connection is dead.
> cfiscsi_task_management_done() is another special case, since iSCSI
> explicitly mentions that there will be no further exchanges related to
> that task, so the data move should explicitly return.  But in general
> case I think it would be nice to not combine those two facts of aborting
> task and aborting data transfer, since the last may not necessary imply
> the first, as the first does not imply the last.
> 
> It is tempting to just set the flag, but I suspect it may cause more
> problems (I am worrying about HA peer update).  I'd prefer the race
> inside the transport to be handled inside the transport.  May be we
> could just add another flag to be set under the session lock inside
> cfiscsi_session_terminate_tasks() after it aborted all the tasks, but
> before it start to abort transfers itself and use it inside
> cfiscsi_datamove_out() instead of cs_terminating.

I think though that even that flag wouldn't quite help as you would still
have the problem that the internal_error response would still be sent on
the wire if we don't abort the entire task vs aborting just the move.

It may be that all I really need to do is fix the one case in
cfiscsi_datamove_out() to set CTL_FLAG_ABORT rather than changing
cfiscsi_data_move_abort()?

-- 
John Baldwin