AIC7902 SCSI Timeouts and Dumps

cp cp at olympus.net
Sun May 18 20:31:13 PDT 2003


This is partly a repost. I want to try one more time before
filling out the bug report.

This new system gets SCSI timeouts which cause drives to be
dropped while running. It runs rock solid when
booted from the IDE drive. It works perfectly under W2K.

I've tried 4.8 April 3 and 5.0 January. On 4.8 the problem
is quite random and becomes less frequent when the
controller drive speed is set down to 160/80MHz. It is still
not stable enough to go into production. On 5.0 it's quite
hopeless. The panics vary depending on what disk data
is needed at the time, but all point back to some timing issue on the
controller. I've run Channel A and B with the drives separated
or on the same channel.

The only consistent clue is that ahd1 gives a 'card paused'
and Card Dump on 4.8 during boot (see below). I have
tried every BIOS, hardware disable and SCSI utility possible
with no luck. I can make it better but not good enough. The system
worked for the vendor, but they don't do Unix. Adaptec
doesn't officially support FreeBSD since the controller is embedded
on the motherboard. They provide drivers for Red Hat, SUSE, W2K, DOS etc.

Unless someone just happens to know something about
dumps from the ahd driver, I realize that asking anyone to evaluate
the information I've collected is over the line, so I'm not
including that information or asking that question. My
question ends up being: how do I determine what is
being worked on for state-of-the-art hardware, so that I can avoid
continuing to take potshots at reinstallations, upgrades
and reconfigurations? (The bug report list does not have
this problem listed.)

I've put absurd hours into this system and never had
such a trying experience with FreeBSD... but I've never
needed any support before. The only thing I can say
for sure is the machine fails on FreeBSD and works
on W2K Server. Sadly, all the people selling hardware
care only about the latter.

Hardware:
Supermicro 7043A-8R (X5DA8 Mbd, 2 Xeon 2.6GHz, 2 GB, AIC7902,
Super GEM 318, E7505), 2 Seagate Cheetah ST336753LC and 1
WD1201AD IDE.

Pertinent part of dmesg (this is just the Dump Card State
and Card Paused that occurs at every boot):

ahd1: PCI error Interrupt
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahd1: Dumping Card State at program address 0x90 Mode 0x33
Card was paused
HS_MAILBOX[0x0] INTCTL[0x0] SEQINTSTAT[0x0] SAVED_MODE[0x0]
DFFSTAT[0x30]:(CURRFIFO_0|FIFO0FREE|FIFO1FREE) SCSISIGI[0x0]:(P_DATAOUT)
SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE)
SCSISEQ0[0x0] SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI)
SEQCTL0[0x10]:(FASTMODE) SEQINTCTL[0x80]:(INTVEC1DSL)
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x0] SSTAT1[0x8]:(BUSFREE)
SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO)
LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0]
LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 16 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0x0 NEXTSCB 0x0
qinstart = 0 qinfifonext = 0
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
Total 0
Kernel Free SCB list: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:

ahd1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0, LJSCB 0xff00
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
ahd1: FIFO1 Free, LONGJMP == 0x8072, SCB 0x0, LJSCB 0xff00
SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
ahd1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
ahd1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0

SIMODE0[0x6c]:(ENOVERRUN|ENIOERR|ENSELDI|ENSELDO)
CCSCBCTL[0x0]
ahd1: REG0 == 0xe735, SINDEX = 0x33, DINDEX = 0x0
ahd1: SCBPTR == 0x1ff, SCB_NEXT == 0xff00, SCB_NEXT2 == 0x0
CDB ff 1 0 0 0 0
STACK: 0x1 0x8 0x7 0x6 0x5 0x4 0x3 0x2e
>>>>>>>>>>>>>>>>>
ahd1: Signaled Target Abort
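
For anyone who wants to compare dumps like this between boots or
kernel versions: the register entries above follow a regular
NAME[0xVALUE]:(FLAG|FLAG) pattern. The following is only a minimal
Python sketch, assuming that pattern holds for every entry. The
function name parse_registers is made up for illustration and is not
part of the ahd driver or any FreeBSD tool. It does not interpret
what the registers mean; it only collects them so two dumps can be
compared.

```python
import re

# Matches entries such as "SSTAT1[0x8]:(BUSFREE)" or "SCSISEQ0[0x0]".
# The flag list in parentheses is optional.
REG_RE = re.compile(
    r"([A-Z][A-Z0-9_]*)\[(0x[0-9a-fA-F]+)\](?::\(([^)]*)\))?"
)


def parse_registers(text):
    """Return a list of (name, value, flags) in the order they appear.

    A list is used rather than a dict because the dump repeats some
    register names (once for FIFO0 and once for FIFO1).
    """
    entries = []
    for name, value, flags in REG_RE.findall(text):
        # Mail clients may wrap a long flag list; drop any whitespace.
        flags = re.sub(r"\s+", "", flags)
        entries.append((name, int(value, 16), flags.split("|") if flags else []))
    return entries


if __name__ == "__main__":
    # Three lines copied from the dump above.
    sample = """SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO)
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] SSTAT0[0x0] SSTAT1[0x8]:(BUSFREE)"""
    for name, value, flags in parse_registers(sample):
        if value:  # show only the non-zero registers
            print(name, hex(value), flags)
```

Running parse_registers over the saved dmesg from two boots and
comparing the resulting lists would show which registers differ.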




