Tracking down SCSI parity errors.

Alex Finch A.Finch at lancaster.ac.uk
Fri Aug 27 08:22:49 PDT 2004


  Hi,

    My basic question is:

   Is it possible to work out the cause of SCSI parity errors?

  I need to know whether I am justified in asking my supplier for a replacement
tape drive, or whether there could be another cause.

  I have a Spectra Logic 2000 tape library, which contains a SONY drive:
   Vendor: SONY     Model: SDX-500C         Rev: 0200
   Type:   Sequential-Access                ANSI SCSI revision: 02

It is connected via an Adaptec 29160 card.

  The library recently went wrong and was replaced. I then spent a long time
trying to get the robot part to work correctly, and finally found that using mtx
version 1.2.17 rather than 1.3.x provided reliable robot control.

  Then I came to try a (now long overdue!) backup, and I started getting SCSI
parity errors.

  Under kernel version 2.4.13 I get occasional errors in the log like:

==================================================================================
Aug 27 16:06:54 lapg kernel: scsi0: PCI error Interrupt at seqaddr = 0x8a
Aug 27 16:06:54 lapg kernel: scsi0: Data Parity Error Detected during address or write data phase
Aug 27 16:07:54 lapg kernel: scsi0: PCI error Interrupt at seqaddr = 0x8e
Aug 27 16:07:54 lapg kernel: scsi0: Data Parity Error Detected during address or write data phase
=====================================================================================

but it APPEARS to carry on working. Under more recent kernels (2.4.23 or
2.4.26), however, it gives up completely straight away, and the log contains:

==============================================================================

Aug 27 09:30:15 lapg last message repeated 13 times
Aug 27 09:30:15 lapg kernel: ^IUnexpected non-DT Data Phase
Aug 27 09:30:15 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x1a) SCSIRATE(0x0)
Aug 27 09:30:15 lapg last message repeated 2 times
Aug 27 09:30:15 lapg kernel: scsi0: Unexpected busfree while idle
Aug 27 09:30:15 lapg kernel: SEQADDR == 0x1a
Aug 27 09:30:15 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x18) SCSIRATE(0x0)
Aug 27 09:30:15 lapg kernel: ^IUnexpected non-DT Data Phase
Aug 27 09:30:15 lapg kernel: (scsi0:A:3:0): No or incomplete CDB sent to device.
Aug 27 09:30:15 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:30:30 lapg kernel: (scsi0:A:4:0): No or incomplete CDB sent to device.
Aug 27 09:30:30 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:30:45 lapg kernel: (scsi0:A:5:0): No or incomplete CDB sent to device.
Aug 27 09:30:45 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:00 lapg kernel: (scsi0:A:6:0): No or incomplete CDB sent to device.
Aug 27 09:31:00 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:15 lapg kernel: (scsi0:A:8:0): No or incomplete CDB sent to device.
Aug 27 09:31:15 lapg kernel: scsi0: Issued Channel A Bus Reset. 1 SCBs aborted
Aug 27 09:31:30 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x18) SCSIRATE(0x0)
:
======================================================================================

at which point the application trying to do the write to tape gives up with an 
error.
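
  In case it helps anyone compare the two kernels, here is a rough sketch of
how the parity-related messages could be tallied from the log. It is only an
illustration: the function name tally_parity_messages and the regular
expression are my own assumptions about the syslog layout shown above, wrapped
lines are assumed to have been rejoined, and "last message repeated N times"
lines are not counted.

```python
import re
from collections import Counter

# Assumed syslog layout: "Aug 27 16:06:54 lapg kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+kernel:\s+(.*)$")


def tally_parity_messages(lines):
    """Count kernel messages that mention parity, keyed by message text."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and "parity" in m.group(2).lower():
            counts[m.group(2).strip()] += 1
    return counts


# A few lines taken from the logs above (wrapped lines rejoined).
sample = [
    "Aug 27 16:06:54 lapg kernel: scsi0: PCI error Interrupt at seqaddr = 0x8a",
    "Aug 27 16:06:54 lapg kernel: scsi0: Data Parity Error Detected during address or write data phase",
    "Aug 27 09:30:15 lapg kernel: scsi0:A:15: parity error detected while idle. SEQADDR(0x1a) SCSIRATE(0x0)",
    "Aug 27 09:30:15 lapg kernel: scsi0: Unexpected busfree while idle",
]

for msg, n in tally_parity_messages(sample).items():
    print(n, msg)
```

  To run it over a real log, the lines of the file can be passed in place of
the sample list, for example tally_parity_messages(open(path)), where path is
wherever the kernel messages are logged on the machine in question.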

  I only get problems when writing; reading and positioning the tape both work OK.

  I have tried different cables, terminators, etc., but the problem does
not go away.

	Grateful for any help!


			Alex Finch
-- 

  Alex Finch, Research Fellow, Physics Department, Lancaster University.



More information about the aic7xxx mailing list