AIC7902 w/ seagate U320 drive issue on releng-4 (and current)

Justin T. Gibbs gibbs at scsiguy.com
Sun Jul 27 21:05:42 PDT 2003


> I wonder if the driver could back-off or do some test first.

This would be something for the CAM transport layer to do.  Right
now, it only throttles based on the drive reporting queue full status.
It would be possible to also have the transport layer track errors
and throttle based on that, but classifying errors that should
result in a "speed throttle" versus a "tag throttle" would be
tricky. 
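
As a rough illustration of the "tag throttle" idea, here is a minimal
sketch in C.  It is not the actual CAM code: the struct, the function
names, and the RAMP_THRESHOLD value are all made up for this example.
It assumes the only input is queue full status from the drive, and
that the tag count is slowly raised again after a run of successes.

```c
/* Hypothetical per-device throttle state; not the real CAM structures. */
struct tag_throttle {
	int min_tags;	/* never throttle below this many tags */
	int max_tags;	/* ceiling chosen when the device attached */
	int cur_tags;	/* tags currently allowed to be outstanding */
	int good_count;	/* successful completions since the last change */
};

/* Assumed value: successes required before allowing one more tag. */
#define RAMP_THRESHOLD 100

static void
throttle_init(struct tag_throttle *t, int min_tags, int max_tags)
{
	t->min_tags = min_tags;
	t->max_tags = max_tags;
	t->cur_tags = max_tags;
	t->good_count = 0;
}

/*
 * The drive reported queue full while `outstanding` commands were
 * in flight, so it could not accept the most recent one.
 */
static void
throttle_queue_full(struct tag_throttle *t, int outstanding)
{
	int limit = outstanding - 1;

	if (limit < t->min_tags)
		limit = t->min_tags;
	if (limit < t->cur_tags)
		t->cur_tags = limit;
	t->good_count = 0;
}

/* A command completed without error; cautiously ramp back up. */
static void
throttle_success(struct tag_throttle *t)
{
	if (t->cur_tags >= t->max_tags)
		return;
	if (++t->good_count >= RAMP_THRESHOLD) {
		t->cur_tags++;
		t->good_count = 0;
	}
}
```

A "speed throttle" would need a second piece of state (the negotiated
transfer rate) and some rule for deciding which errors should lower
it instead of the tag count.  That classification is the tricky part,
and the sketch makes no attempt at it.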

> interestingly, 004 was fine until a recent driver rev (or at
> least, the problem did not manifest).

The driver has been getting faster due to some recent optimizations.
Sorry. 8-)

> Why would the behaviour be such that the drive disappears from
> the SCSI chain and not even a system reset fixes it?

Some versions of the firmware were shipped with a diagnostic
feature enabled that causes the equivalent of an assert.  This
is great for firmware engineers since the drive stops dead
at the location of the error.  It's not so good for end users.

> I'm very
> surprised that resetting the motherboard doesn't reset the drive,
> only a powercycle does in this case.

Why is this surprising?  What does resetting the motherboard do
to the drive?  A SCSI bus reset may occur, but that should occur
during the driver's error recovery anyway.  Out-to-lunch drives
typically only come back with a power cycle.

> Why would the 160 version of the same drive not have the same
> bug? I guess that's a question for seagate :)

The 160 version does not support the packetized protocol.  This is
a U320 feature.  Seagate introduced some bugs in supporting this
new protocol mode.

> Should i just drop the number of tags down to 32 or 64 on spec,
> or is there another cause likely?

I have yet to see the failure running these drives at 32 tags.

--
Justin


