[Bug 218572] pass(4) driver sometimes does error recovery when CAM_PASS_ERR_RECOVER is not set

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Tue Apr 11 21:27:59 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218572

            Bug ID: 218572
           Summary: pass(4) driver sometimes does error recovery when
                    CAM_PASS_ERR_RECOVER is not set
           Product: Base System
           Version: 10.3-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: terry-freebsd at glaver.org

[This is a summation of a long discussion between me, ken@ and mav@]

After SVN rev 236814 in FreeBSD/head, the pass(4) driver performs error
recovery in some, but not all, cases when retry_count is nonzero in the CCB
and CAM_PASS_ERR_RECOVER is not set.

Previously, the pass(4) driver would only do error recovery if
CAM_PASS_ERR_RECOVER was set.

This can be seen with 'camcontrol tur -v'.  camcontrol sets retry_count
to 1 by default, so that the user gets at least one retry after turning
on retries with -E.

If you reset a hard drive:

# camcontrol reset 1:172:0
Reset of 1:172:0 was successful

There should be a Unit Attention pending:

# camcontrol tur 1:172:0 -v
Unit is ready

But the Unit Attention is not reported, because the kernel is doing error
recovery even though we have not turned it on with -E (which sets the
CAM_PASS_ERR_RECOVER flag on the CCB).

Retrying the experiment:

# camcontrol reset 1:172:0
Reset of 1:172:0 was successful

Now set the retry count to 0:

# camcontrol tur 1:172:0 -v -C 0
Unit is not ready
(pass42:mps1:0:172:0): TEST UNIT READY. CDB: 00 00 00 00 00 00 
(pass42:mps1:0:172:0): CAM status: SCSI Status Error
(pass42:mps1:0:172:0): SCSI status: Check Condition
(pass42:mps1:0:172:0): SCSI sense: UNIT ATTENTION asc:29,2 (SCSI bus reset occurred)
(pass42:mps1:0:172:0): Field Replaceable Unit: 2

We get the unit attention.
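
The two runs above can be summarized with a small model.  This is only an
illustrative sketch in Python, not the kernel code: the flag value is a
placeholder, and should_recover() and unwanted_recovery_possible() are
hypothetical names.  It encodes the intended rule (recover only when
CAM_PASS_ERR_RECOVER is set) and the condition under which this report
observes unwanted recovery (flag clear, retry_count nonzero).

```python
# Placeholder bit for illustration only; the real CAM_PASS_ERR_RECOVER
# value is defined in the FreeBSD CAM headers.
CAM_PASS_ERR_RECOVER = 0x1


def should_recover(flags):
    """Intended rule: pass(4) recovers only if the flag is set."""
    return bool(flags & CAM_PASS_ERR_RECOVER)


def unwanted_recovery_possible(flags, retry_count):
    """Condition described in this report: flag clear, retries nonzero."""
    return not should_recover(flags) and retry_count > 0


# camcontrol tur -v        (default retry_count of 1, no -E)
default_run = unwanted_recovery_possible(flags=0, retry_count=1)
# camcontrol tur -v -C 0   (retry_count of 0, no -E)
zero_retry_run = unwanted_recovery_possible(flags=0, retry_count=0)
```

In the first run the Unit Attention can be consumed by a retry (the report
notes this happens in some, but not all, cases); in the second the error is
returned to camcontrol.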

Also, the "Filemark detected" asc/ascq entry (0x00,0x01) and other, similar
tape error recovery entries should probably have an error recovery action
of SS_NOP instead of SS_RDEF.  The application should be notified of 
Filemarks, setmarks, end of medium, etc.
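
As a rough sketch of the proposed table change, again in Python rather than
the kernel's C: the dictionary below is a hypothetical stand-in for the
sense table, SS_NOP and SS_RDEF are treated as opaque labels, and
application_sees() is an invented helper.  Only the (0x00,0x01) entry is
taken from this report; the setmark and end-of-medium codes are filled in
from the standard SCSI ASC/ASCQ assignments, and treating them as SS_RDEF
today is an assumption based on the "other, similar" wording above.

```python
SS_NOP = "SS_NOP"    # no recovery action; the status goes back to the caller
SS_RDEF = "SS_RDEF"  # a recovery action (treated here simply as "retry")

# (asc, ascq) -> description
TAPE_CONDITIONS = {
    (0x00, 0x01): "Filemark detected",
    (0x00, 0x02): "End-of-partition/medium detected",
    (0x00, 0x03): "Setmark detected",
}


def build_table(action):
    """Build a toy sense table that assigns one action to every entry."""
    return {code: (action, desc) for code, desc in TAPE_CONDITIONS.items()}


def application_sees(table, asc, ascq, recovery_enabled):
    """True if the condition reaches the application instead of being retried."""
    action, _desc = table[(asc, ascq)]
    return action == SS_NOP or not recovery_enabled


current = build_table(SS_RDEF)   # assumed present-day actions
proposed = build_table(SS_NOP)   # actions suggested in this report
```

With recovery enabled, application_sees(current, 0x00, 0x01, True) is False
in this model, while the same call on the proposed table is True.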

[This affects everything after r237326 in 9-STABLE, so the affected releases
are 9.1, 9.2, 9.3, 10.0, 10.1, 10.2, 10.3, 11.0, and HEAD. As everything
before 10.3 is EoL, the fix only needs to be MFC'd back to 10.3.]
