kern/112119: system hangs when starts k3b on RELENG_6

Scott Long scottl at samsco.org
Fri Apr 27 19:30:09 UTC 2007


The following reply was made to PR kern/112119; it has been noted by GNATS.

From: Scott Long <scottl at samsco.org>
To: Nikolay Pavlov <quetzal at zone3000.net>, Thomas Quinot <thomas at FreeBSD.ORG>,
        "Ganbold.TS" <ganbold at micom.mng.net>, freebsd-stable at FreeBSD.ORG,
        mjacob at FreeBSD.ORG, linimon at FreeBSD.ORG, bug-followup at FreeBSD.ORG
Cc:  
Subject: Re: kern/112119: system hangs when starts k3b on RELENG_6
Date: Fri, 27 Apr 2007 13:19:43 -0600

 Nikolay Pavlov wrote:
 > On Friday, 27 April 2007 at 17:32:18 +0200, Thomas Quinot wrote:
 >> * Ganbold.TS, 2007-04-27 :
 >>
 >>> I tried your patch at
 >>> http://www.freebsd.org/cgi/query-pr.cgi?pr=103602&getpatch=12 and the
 >>> problem is still the same. System freezes upon start of k3b.
 >>>
 >>> I also tried your attached patch, which reverts part of rev. 1.42.2.3
 >>> and the problem is still the same: the system hangs when k3b starts.
 >> Thanks, that's useful info. Please try the attached patch instead, which
 >> reverts another part of 1.42.2.3 (I'm trying to figure out exactly
 >> *which* part of this change is causing the problem).
 >>
 >> Also, were you able to capture system console output at the point where
 >> the crash occurs? We might have some indications there.
 > 
 > This patch works for me. The system no longer reboots and I am able to
 > successfully burn a CD.
 > 
 >> Thomas.
 >>
 > 
 >> Index: atapi-cam.c
 >> ===================================================================
 >> RCS file: /space/mirror/ncvs/src/sys/dev/ata/atapi-cam.c,v
 >> retrieving revision 1.42.2.3
 >> retrieving revision 1.42.2.2
 >> diff -u -r1.42.2.3 -r1.42.2.2
 >> --- atapi-cam.c	29 Mar 2007 20:08:32 -0000	1.42.2.3
 >> +++ atapi-cam.c	6 Mar 2007 16:56:50 -0000	1.42.2.2
 >> @@ -697,39 +680,32 @@
 >>  	    csio->ccb_h.status |= CAM_AUTOSNS_VALID;
 >>  	}
 >>      } else if (request->result != 0) {
 >> -	if ((request->flags & ATA_R_TIMEOUT) != 0) {
 >> -	    rc = CAM_CMD_TIMEOUT;
 >> -	} else {
 >> -	    rc = CAM_SCSI_STATUS_ERROR;
 >> -	    csio->scsi_status = SCSI_STATUS_CHECK_COND;
 >> +	rc = CAM_SCSI_STATUS_ERROR;
 >> +	csio->scsi_status = SCSI_STATUS_CHECK_COND;
 >>  
 >> -	    if ((csio->ccb_h.flags & CAM_DIS_AUTOSENSE) == 0) {
 >> +	if ((csio->ccb_h.flags & CAM_DIS_AUTOSENSE) == 0) {
 >>  #if 0
 >> -		static const int8_t ccb[16] = { ATAPI_REQUEST_SENSE, 0, 0, 0,
 >> -		    sizeof(struct atapi_sense), 0, 0, 0, 0, 0, 0,
 >> -		    0, 0, 0, 0, 0 };
 >> -
 >> -		bcopy (ccb, request->u.atapi.ccb, sizeof ccb);
 >> -		request->data = (caddr_t)&csio->sense_data;
 >> -		request->bytecount = sizeof(struct atapi_sense);
 >> -		request->transfersize = min(request->bytecount, 65534);
 >> -		request->timeout = csio->ccb_h.timeout / 1000;
 >> -		request->retries = 2;
 >> -		request->flags = ATA_R_QUIET|ATA_R_ATAPI|ATA_R_IMMEDIATE;
 >> -		hcb->flags |= AUTOSENSE;
 >> +	    static const int8_t ccb[16] = { ATAPI_REQUEST_SENSE, 0, 0, 0,
 >> +		sizeof(struct atapi_sense), 0, 0, 0, 0, 0, 0,
 >> +		0, 0, 0, 0, 0 };
 >> +
 >> +	    bcopy (ccb, request->u.atapi.ccb, sizeof ccb);
 >> +	    request->data = (caddr_t)&csio->sense_data;
 >> +	    request->bytecount = sizeof(struct atapi_sense);
 >> +	    request->transfersize = min(request->bytecount, 65534);
 >> +	    request->timeout = csio->ccb_h.timeout / 1000;
 >> +	    request->retries = 2;
 >> +	    request->flags = ATA_R_QUIET|ATA_R_ATAPI|ATA_R_IMMEDIATE;
 >> +	    hcb->flags |= AUTOSENSE;
 >>  
 >> -		ata_queue_request(request);
 >> -		return;
 >> +	    ata_queue_request(request);
 >> +	    return;
 >>  #else
 >> -		/*
 >> -		 * Use auto-sense data from the ATA layer, if it has
 >> -		 * issued a REQUEST SENSE automatically and that operation
 >> -		 * returned without error.
 >> -		 */
 >> -		if (request->u.atapi.saved_cmd != 0 && request->error == 0) {
 >> -		    bcopy (&request->u.atapi.sense, &csio->sense_data, sizeof(struct atapi_sense));
 >> -		    csio->ccb_h.status |= CAM_AUTOSNS_VALID;
 >> -		}
 >> +	    /* The ATA driver has already requested sense for us. */
 >> +	    if (request->error == 0) {
 >> +		/* The ATA autosense suceeded. */
 >> +		bcopy (&request->u.atapi.sense, &csio->sense_data, sizeof(struct atapi_sense));
 >> +		csio->ccb_h.status |= CAM_AUTOSNS_VALID;
 >>  	    }
 >>  #endif
 >>  	}
 > 
 
 My best guess is that request->u.atapi.saved_cmd isn't getting preserved
 when ata_completed() does an automatic REQUEST_SENSE.  Not sure if this
 is true or why it would happen.  But if that's the case, then CAM is
 going to manually request sense, which atapi-cam and ata will likely
 treat as a normal DMA-capable command.  Note that the autosense code in
 the ATA driver disables DMA for the REQUEST_SENSE command.  This might
 be a key issue; the drive might be getting very unhappy with a
 DMA-flagged REQUEST_SENSE command, especially if it's already in a
 CHECK_CONDITION state.  This unhappiness might be leading to the
 interrupt storm and observed deadlock on UP systems.
 
 With the patch above, sense info is reported to CAM regardless of the
 contents of saved_cmd, preventing CAM from generating the troublesome
 REQUEST_SENSE on its own.
 
 Oh hell, I know exactly what the problem is!  The opcode for a
 TEST_UNIT_READY is 0x00.  This is probably the command that is
 generating the CHECK_CONDITION.  The test for saved_cmd is entirely
 bogus.  What really needs to happen is for ATA to have an "autosense
 valid" flag in the request.  But without that, the best that you can
 do is to just ignore the contents of saved_cmd and also zero out
 request->u.atapi.sense before issuing every command.
 
 Scott
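
The failure mode described in the message can be illustrated with a small standalone sketch. It is not driver code: `struct sketch_request`, the `autosense_valid` field, and every `sketch_*` or `sense_usable_*` name are hypothetical stand-ins invented for this example, and the 18-byte sense buffer is an assumed size. Only the opcode values (0x00 for TEST UNIT READY, 0x03 for REQUEST SENSE) and the `saved_cmd != 0 && error == 0` test come from the discussion and the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_TEST_UNIT_READY 0x00   /* SCSI opcode; note that it is zero */
#define SKETCH_REQUEST_SENSE   0x03   /* SCSI opcode */
#define SKETCH_SENSE_LEN       18     /* assumed sense buffer size */

/* Hypothetical stand-in for the ATAPI fields of an ATA request. */
struct sketch_request {
    uint8_t saved_cmd;        /* opcode of the command that failed */
    int     error;            /* 0 if the automatic REQUEST SENSE succeeded */
    bool    autosense_valid;  /* hypothetical explicit flag */
    uint8_t sense[SKETCH_SENSE_LEN];
};

/* The test from rev. 1.42.2.3: treats opcode 0x00 as "no autosense done". */
static bool sense_usable_saved_cmd(const struct sketch_request *r)
{
    return r->saved_cmd != 0 && r->error == 0;
}

/* Alternative suggested in the message: rely on an explicit flag. */
static bool sense_usable_flag(const struct sketch_request *r)
{
    return r->autosense_valid && r->error == 0;
}

/* Simulate the ATA layer fetching sense data after a failed command. */
static void sketch_autosense(struct sketch_request *r, uint8_t failed_opcode)
{
    memset(r->sense, 0, sizeof(r->sense));  /* zero stale sense data first */
    r->saved_cmd = failed_opcode;
    r->sense[0] = 0x70;                     /* fixed format, current error */
    r->sense[2] = 0x02;                     /* sense key: NOT READY */
    r->error = 0;
    r->autosense_valid = true;
}

/* Usage: a failed TEST UNIT READY defeats the opcode test but not the flag. */
static void sketch_check(void)
{
    struct sketch_request r;

    sketch_autosense(&r, SKETCH_TEST_UNIT_READY);
    assert(!sense_usable_saved_cmd(&r));    /* valid sense data is ignored */
    assert(sense_usable_flag(&r));          /* the flag still reports it */
}
```

Under these assumptions, the opcode-based test discards valid sense data exactly when the failing command is TEST UNIT READY, which would leave CAM to issue its own REQUEST SENSE as described above.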

