Tracking down SCSI parity errors.
Alex Finch
A.Finch at lancaster.ac.uk
Fri Aug 27 08:22:49 PDT 2004
Hi,
My basic question is: is it possible to work out the cause of SCSI parity errors?
I need to know whether I am justified in asking my supplier for a replacement
tape drive, or whether there could be another cause.
I have a spectralogic 2000 tape library, which contains a SONY drive:
Vendor: SONY Model: SDX-500C Rev: 0200
Type: Sequential-Access ANSI SCSI revision: 02
It is connected via an Adaptec 29160 card.
The library recently went wrong and was replaced. I then spent a long time
trying to get the robot part to work correctly, and finally found that using mtx
version 1.2.17 rather than 1.3.x gave reliable robot control.
Then I came to try a (now long overdue!) backup and started getting SCSI
parity errors.
I find that under kernel version 2.4.13 I get occasional errors in the log like:
==================================================================================
Aug 27 16:06:54 lapg kernel: scsi0: PCI error Interrupt at seqaddr = 0x8a
Aug 27 16:06:54 lapg kernel: scsi0: Data Parity Error Detected during address or write data phase
Aug 27 16:07:54 lapg kernel: scsi0: PCI error Interrupt at seqaddr = 0x8e
Aug 27 16:07:54 lapg kernel: scsi0: Data Parity Error Detected during address or write data phase
=====================================================================================
but it APPEARS to carry on working. Under more recent kernels (2.4.23 or
2.4.26), however, it gives up completely straight away, and the log contains:
==============================================================================
Aug 27 09:30:15 lapg last message repeated 13 times
Aug 27 09:30:15 lapg kernel: ^IUnexpected non-DT Data Phase
Aug 27 09:30:15 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x1a) SCSIRATE(0x0)
Aug 27 09:30:15 lapg last message repeated 2 times
Aug 27 09:30:15 lapg kernel: scsi0: Unexpected busfree while idle
Aug 27 09:30:15 lapg kernel: SEQADDR == 0x1a
Aug 27 09:30:15 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x18) SCSIRATE(0x0)
Aug 27 09:30:15 lapg kernel: ^IUnexpected non-DT Data Phase
Aug 27 09:30:15 lapg kernel: (scsi0:A:3:0): No or incomplete CDB sent to device.
Aug 27 09:30:15 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:30:30 lapg kernel: (scsi0:A:4:0): No or incomplete CDB sent to device.
Aug 27 09:30:30 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:30:45 lapg kernel: (scsi0:A:5:0): No or incomplete CDB sent to device.
Aug 27 09:30:45 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:00 lapg kernel: (scsi0:A:6:0): No or incomplete CDB sent to device.
Aug 27 09:31:00 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:15 lapg kernel: (scsi0:A:8:0): No or incomplete CDB sent to device.
Aug 27 09:31:15 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:30 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x18) SCSIRATE(0x0)
:
======================================================================================
at which point the application trying to do the write to tape gives up with an
error.
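To compare how often each kind of message turns up under the different kernels, the log could be tallied with something like the Python sketch below. It is only an illustration: the category names and the summarise function are invented for this purpose, and the patterns are simply copied from the excerpts above, so they may well miss other aic7xxx messages.

```python
import re
from collections import Counter

# Message fragments copied from the log excerpts above. The category
# names on the left are made-up labels, not driver terminology.
PATTERNS = {
    "pci_parity": re.compile(r"Data Parity Error Detected"),
    "scsi_parity_idle": re.compile(r"parity error detected while idle"),
    "bus_reset": re.compile(r"Issued Channel \w Bus Reset"),
    "incomplete_cdb": re.compile(r"No or incomplete CDB sent to device"),
}

# Matches a prefix such as "(scsi0:A:3:0)" and captures the target ID (3).
TARGET = re.compile(r"\(scsi\d+:\w:(\d+):\d+\)")


def summarise(lines):
    """Count each message category and collect targets with CDB failures."""
    counts = Counter()
    targets = set()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                if name == "incomplete_cdb":
                    match = TARGET.search(line)
                    if match:
                        targets.add(int(match.group(1)))
    return counts, sorted(targets)
```

Calling summarise on an open log file (for example /var/log/messages, assuming that is where the kernel messages are written) returns the per-category counts and the list of target IDs that reported an incomplete CDB.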
I only get problems when writing; reading and positioning the tape both work OK.
I have tried different cables and terminators, among other things, but the
problem does not go away.
Grateful for any help!
Alex Finch
--
Alex Finch, Research Fellow, Physics Department, Lancaster University.