svn commit: r351550 - head/sys/cam/scsi

Alexander Motin mav at FreeBSD.org
Tue Aug 27 20:57:57 UTC 2019


Some FreeNAS user reported panic after updating to newer version.  On
the screenshot provided were several BUSY statuses for SATA disk on
mps(4), followed by panic "Attempt to remove out-of-bounds index -1 from
queue ...".  In his case I blame ancient LSI firmware or some broken
hardware, but I was able to reproduce the panic on FreeBSD head debug
kernel by hacking mps(4) driver to always report BUSY (appeared except
IDENTIFY and REPORT LUNS).  To diagnose it I inserted assertion into
xpt_free_ccb(), checking ccb->ccb_h.pinfo.index for values used for
requests still in send queue.  Not sure it is to be persistent, but in
this case it lead me directly to this place.

On 27.08.2019 16:23, Scott Long wrote:
> This is very concerning, and I wonder if it’s the cause of the mystery use-after-free / double-complete that I’ve seen for years and have never been able to catch.  Can you say more about how you found it?
> 
> Scott
> 
> 
>> On Aug 27, 2019, at 10:41 AM, Alexander Motin <mav at FreeBSD.org> wrote:
>>
>> Author: mav
>> Date: Tue Aug 27 16:41:06 2019
>> New Revision: 351550
>> URL: https://svnweb.freebsd.org/changeset/base/351550
>>
>> Log:
>>  Always check cam_periph_error() status for ERESTART.
>>
>>  Even if we do not expect retries, we better be sure, since otherwise it
>>  may result in use after free kernel panic.  I've noticed that it retries
>>  SCSI_STATUS_BUSY even with SF_NO_RECOVERY | SF_NO_RETRY.
>>
>>  MFC after:	1 week
>>  Sponsored by:	iXsystems, Inc.
>>
>> Modified:
>>  head/sys/cam/scsi/scsi_xpt.c
>>
>> Modified: head/sys/cam/scsi/scsi_xpt.c
>> ==============================================================================
>> --- head/sys/cam/scsi/scsi_xpt.c	Tue Aug 27 15:42:08 2019	(r351549)
>> +++ head/sys/cam/scsi/scsi_xpt.c	Tue Aug 27 16:41:06 2019	(r351550)
>> @@ -1684,8 +1684,9 @@ probe_device_check:
>> 	case PROBE_TUR_FOR_NEGOTIATION:
>> 	case PROBE_DV_EXIT:
>> 		if (cam_ccb_status(done_ccb) != CAM_REQ_CMP) {
>> -			cam_periph_error(done_ccb, 0,
>> -			    SF_NO_PRINT | SF_NO_RECOVERY | SF_NO_RETRY);
>> +			if (cam_periph_error(done_ccb, 0, SF_NO_PRINT |
>> +			    SF_NO_RECOVERY | SF_NO_RETRY) == ERESTART)
>> +				goto outr;
>> 		}
>> 		if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
>> 			/* Don't wedge the queue */
>> @@ -1735,8 +1736,9 @@ probe_device_check:
>> 		struct ccb_scsiio *csio;
>>
>> 		if (cam_ccb_status(done_ccb) != CAM_REQ_CMP) {
>> -			cam_periph_error(done_ccb, 0,
>> -			    SF_NO_PRINT | SF_NO_RECOVERY | SF_NO_RETRY);
>> +			if (cam_periph_error(done_ccb, 0, SF_NO_PRINT |
>> +			    SF_NO_RECOVERY | SF_NO_RETRY) == ERESTART)
>> +				goto outr;
>> 		}
>> 		if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) {
>> 			/* Don't wedge the queue */
>>
> 

-- 
Alexander Motin


More information about the svn-src-head mailing list