ZFS related kernel panic

Alexander Motin mav at FreeBSD.org
Thu Sep 16 10:58:40 UTC 2010


Alexander Motin wrote:
> It looks like during timeout handling (which is quite a complicated
> process when a port multiplier is used) some request was completed twice.
> So the original problem is probably in hardware (try checking/replacing
> cables, the multiplier, ...), which caused the timeout, but the fact that
> the driver was unable to handle it is probably a siis(4) driver bug.

Thanks to the console access provided, I have found the reason for the
crash. The attached patch should fix it. The patched system has now been
running the stress test successfully for 45 minutes, compared to crashing
within a few minutes without it.

Also, I've found that the timeouts reported by the driver are not fatal.
The affected commands complete correctly as soon as the driver, after
detecting a timeout, freezes new incoming requests to resolve the
situation and, as a result, idles the bus. I think these timeouts are
caused by some congestion on the SATA interface, which is probably caused
by the port multiplier. This panic could be triggered only by such fake
timeouts, not the real ones.

-- 
Alexander Motin
-------------- next part --------------
--- siis.c.debug	2010-09-16 11:11:59.000000000 +0100
+++ siis.c	2010-09-16 11:12:31.000000000 +0100
@@ -1209,6 +1209,7 @@ siis_end_transaction(struct siis_slot *s
 	device_t dev = slot->dev;
 	struct siis_channel *ch = device_get_softc(dev);
 	union ccb *ccb = slot->ccb;
+	int lastto;
 
 	mtx_assert(&ch->mtx, MA_OWNED);
 	bus_dmamap_sync(ch->dma.work_tag, ch->dma.work_map,
@@ -1292,11 +1293,6 @@ siis_end_transaction(struct siis_slot *s
 	ch->oslots &= ~(1 << slot->slot);
 	ch->rslots &= ~(1 << slot->slot);
 	ch->aslots &= ~(1 << slot->slot);
-	if (et != SIIS_ERR_TIMEOUT) {
-		if (ch->toslots == (1 << slot->slot))
-			xpt_release_simq(ch->sim, TRUE);
-		ch->toslots &= ~(1 << slot->slot);
-	}
 	slot->state = SIIS_SLOT_EMPTY;
 	slot->ccb = NULL;
 	/* Update channel stats. */
@@ -1305,6 +1301,13 @@ siis_end_transaction(struct siis_slot *s
 	    (ccb->ataio.cmd.flags & CAM_ATAIO_FPDMA)) {
 		ch->numtslots[ccb->ccb_h.target_id]--;
 	}
+	/* Cancel timeout state if request completed normally. */
+	if (et != SIIS_ERR_TIMEOUT) {
+		lastto = (ch->toslots == (1 << slot->slot));
+		ch->toslots &= ~(1 << slot->slot);
+		if (lastto)
+			xpt_release_simq(ch->sim, TRUE);
+	}
 	/* If it was our READ LOG command - process it. */
 	if (ch->readlog) {
 		siis_process_read_log(dev, ccb);

