card state dump during tape backup

Georg Altmann galtmann at las-cad.com
Thu Jan 20 12:01:09 PST 2005


Hi,

I am having troubles with our tape backup. The system has been running 
stable for about 3 month now after a tape drive failure/replacement and 
maybe for one year before that. The system (hardware/kernel/software) has 
not been touched during this period. A week ago the backup software 
(Netvault 7.1) showed the tape drive being offline and I found a scsi card 
dump in the logs (see below). The problem recurred almost daily the past 
few days during the differential backups (1-3 GB maybe). Still I am able to 
perform larger test backups (10 GB) without the problem occurring.
After restarting netvault and doing a camcontrol reset all, the backup is 
running fine again for 12h to 48h.

We are using an ADIC Fastor 2 tape library with a LTO1 tape drive attached 
to an Adaptec 29160 ultra160 adapter. FreeBSD is 4.10-RELEASE-p4. Machine 
details see below.

Although I don't suspect this to be the cause: termination is set to 
automatic for LVD in the adapter BIOS (68pin external connector) and an 
active terminator is plugged on the library side. I checked cables again 2 
days ago.

My first guess was to disable wide negotiation for all devices and set the 
tape drive to 40 MB/s in the adapter BIOS, but I doesn't seem to help.

What is this? Is the hardware starting to fail?
And if so is likely to be something on the SCSI bus e.g. the drive or is it 
the motherboard or the SCSI card failing?
I mean it is very unlikely for it to be some kind of kernel or software 
failure, right?

Any help would be greatly appreciated.

Best regards,
Georg


# pciconf -lv
ahc0 at pci0:12:0: class=0x010000 card=0xe2a09005 chip=0x00809005 rev=0x02 
hdr=0x00
    vendor   = 'Adaptec'
    device   = '29160/N/LP Ultra160 (AIC-7892A) SCSI Host Adapter'
    class    = mass storage
    subclass = SCSI

# camcontrol devlist
<ADIC FastStor 2 A12r>             at scbus0 target 0 lun 0 (pass0)
<HP Ultrium 1-SCSI E33A>           at scbus0 target 5 lun 0 (pass1,sa0)

relevant (?) kernel options:
options       SCSI_DELAY=8000
options       CAMDEBUG
options       AHC_ALLOW_MEMIO

/var/run/dmesg.boot:
CPU: AMD Athlon(tm) Processor (1102.51-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x642  Stepping = 2
 
Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
  AMD Features=0xc0440000<RSVD,AMIE,DSP,3DNow!>
real memory  = 805240832 (786368K bytes)
avail memory = 780029952 (761748K bytes)
pci0: <VIA 83C572 USB controller> at 7.2 irq 9
pci0: <VIA 83C572 USB controller> at 7.3 irq 9
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xbc00-0xbcff mem 
0xdfffb000-0xdfffbfff irq 5 at device 12.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
sa0 at ahc0 bus 0 target 5 lun 0
sa0: <HP Ultrium 1-SCSI E33A> Removable Sequential Access SCSI-3 device
sa0: 40.000MB/s transfers (40.000MHz, offset 15)
pass0 at ahc0 bus 0 target 0 lun 0
pass0: <ADIC FastStor 2 A12r> Removable Changer SCSI-2 device
pass0: 3.300MB/s transfers

Here comes the dump:
Jan 19 22:46:01 asterix /kernel: ahc0: Recovery Initiated
Jan 19 22:46:01 asterix /kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins 
<<<<<<<<<<<<<<<<<
Jan 19 22:46:01 asterix /kernel: ahc0: Dumping Card State while idle, at 
SEQADDR 0x9
Jan 19 22:46:01 asterix /kernel: Card was paused
Jan 19 22:46:01 asterix /kernel: ACCUM = 0x4, SINDEX = 0x7, DINDEX = 0x21, 
ARG_2 = 0x3
Jan 19 22:46:01 asterix /kernel: HCNT = 0x0 SCBPTR = 0x1
Jan 19 22:46:01 asterix /kernel: SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] 
SCSIBUSL[0x0]
Jan 19 22:46:01 asterix /kernel: LASTPHASE[0x1]:(P_BUSFREE) 
SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
Jan 19 22:46:01 asterix /kernel: SBLKCTL[0xa]:(SELWIDE|SELBUSB) 
SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE)
Jan 19 22:46:01 asterix /kernel: 
SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SSTAT0[0x0]
Jan 19 22:46:01 asterix /kernel: SSTAT1[0x8]:(BUSFREE) SSTAT2[0x0] 
SSTAT3[0x0] SIMODE0[0x8]:(ENSWRAP)
Jan 19 22:46:01 asterix /kernel: 
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) SXFRCTL0[0x80]:(DFON)
Jan 19 22:46:01 asterix /kernel: DFCNTRL[0x0] 
DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
Jan 19 22:46:01 asterix /kernel: STACK: 0x0 0x164 0x179 0x3
Jan 19 22:46:01 asterix /kernel: SCB count = 20
Jan 19 22:46:01 asterix /kernel: Kernel NEXTQSCB = 4
Jan 19 22:46:01 asterix /kernel: Card NEXTQSCB = 4
Jan 19 22:46:01 asterix /kernel: QINFIFO entries:
Jan 19 22:46:01 asterix /kernel: Waiting Queue entries:
Jan 19 22:46:01 asterix /kernel: Disconnected Queue entries: 1:14
Jan 19 22:46:01 asterix /kernel: QOUTFIFO entries:
Jan 19 22:46:01 asterix /kernel: Sequencer Free SCB List: 0 2 3 4 5 6 7 8 9 
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Jan 19 22:46:01 asterix /kernel: Sequencer SCB Info:
Jan 19 22:46:01 asterix /kernel: 0 SCB_CONTROL[0xc0]:(DISCENB|TARGET_SCB) 
SCB_SCSIID[0x7]
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 1 SCB_CONTROL[0x44]:(DISCONNECTED|DISCENB) 
SCB_SCSIID[0x7]
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0x0] SCB_TAG[0xe]
Jan 19 22:46:01 asterix /kernel: 2 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 3 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 4 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 5 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 6 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 7 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 8 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 9 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 10 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 11 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 12 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 13 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 15 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 16 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 17 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 18 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 19 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 20 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 21 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 22 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 23 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 24 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 25 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 26 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 27 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 28 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 29 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 30 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: 31 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff]:(TWIN_CHNLB|OID|TWIN_TID)
Jan 19 22:46:01 asterix /kernel: SCB_LUN[0xff]:(SCB_XFERLEN_ODD|LID) 
SCB_TAG[0xff]
Jan 19 22:46:01 asterix /kernel: Pending list:
Jan 19 22:46:01 asterix /kernel: 14 SCB_CONTROL[0x40]:(DISCENB) 
SCB_SCSIID[0x7] SCB_LUN[0x0]
Jan 19 22:46:01 asterix /kernel: Kernel Free SCB list: 9 15 16 17 18 19 0 1 
2 3 5 6 7 8 13 12 11 10
Jan 19 22:46:01 asterix /kernel: Untagged Q(0): 14
Jan 19 22:46:01 asterix /kernel:
Jan 19 22:46:01 asterix /kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends 
>>>>>>>>>>>>>>>>>>
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): SCB 0xe - timed out
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Queuing a BDR SCB
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): Bus Device Reset 
Message Sent
Jan 19 22:46:01 asterix /kernel: (pass0:ahc0:0:0:0): no longer in timeout, 
status = 34b
Jan 19 22:46:01 asterix /kernel: ahc0: Bus Device Reset on A:0. 1 SCBs 
aborted
Jan 19 22:46:01 asterix /kernel: ahc0: Timedout SCBs already complete. 
Interrupts may not be functioning.



More information about the freebsd-scsi mailing list