Kernel-SCSI crash. (Serious bug since 5.0.11?)

Doug Ledford dledford at redhat.com
Fri Jul 9 09:51:58 PDT 1999


Nick Taylor wrote:
> 
> Hi
> 
> I am still seeing 3940 problems; I get a similar lockup. For me it happens
> as soon as I try to access 2 HDs at the same time.
> 
> However it appears that not all 3940Ws fail as some people have indicated
> that they are using them OK.
> 
> My problem also appeared with kernel 2.0.36, 2.0.33 being OK. I am also
> convinced that something has been broken, but sadly am not a C hacker so
> don't know how this problem can be resolved.
> 
> Nick
> ---
> 
> Stephan Loescher wrote:
> 
> > Hi!
> >
> > I have found a bug that first appears in the aic7xxx code in Linux
> > 2.0.34 (5.0.14/3.2.4) and is still there in recent 2.3.xx kernels! The
> > aic7xxx code in Linux 2.0.33 (4.1.1/3.2.1) runs stable for me.
> >
> > The symptoms:
> > When I copy a lot of large files from my harddisk (IBM DCAS-34330W) to
> > my magneto-optical (MO) drive, then after some time (5 seconds to
> > several minutes) the Linux kernel stops. The system is frozen (locked
> > up) and the SCSI-bus LED, the MO LED and the harddisk LED are all lit. I
> > can't log into my system. Mouse and keyboard are "dead".
> > The source and target filesystems are ext2.
> > I can reproduce this behaviour.
> > I can copy files between all my harddisks without any error.
> > With kernel 2.0.33 there are no problems!
> >
> > I nailed it down with linux/Documentation/BUG-HUNTING to the
> > aic7xxx-Code, because when I replace the aic7xxx-files in 2.0.34 with
> > the files from 2.0.33, then the system runs stable.
> >
> > I have tried the following kernels:
> > 2.0.34
> > 2.0.35
> > 2.0.36
> > 2.1.128
> > 2.2.2
> > 2.2.5
> > 2.2.6
> > 2.2.7
> > 2.2.10
> > 2.3.4
> > (with and without all AC-patches)
> >
> > Also disabling all aic7xxx-features does not help.
> > I tried these options:
> > aic7xxx=verbose, aic7xxx=pci_parity, aic7xxx=verbose:0x1ffff
> > and disabled TAGGED_QUEUEING entirely.
> >
> > To help you find the bug, I tried all aic7xxx-patches for Linux
> > 2.0.33 from the last 4.x.x up to 5.0.13. The results are:
> >
> > 5.0.0 /3.2.2: OK
> > 5.0.1 /3.2.2: does not boot, seems _very_ unstable
> > 5.0.10/3.2.2: OK
> > 5.0.11/3.2.2: Makes endless SCSI-resets after issuing commands like
> >               echo "scsi remove-single-device 0 0 1 0 " >/proc/scsi/scsi
> > 5.0.12/3.2.2: locks up the system as 5.0.14 does!
> > 5.0.13/3.2.2: locks up the system as 5.0.14 does!
> >
> > My system:
> > Pentium-200 (single-CPU)
> > SCSI-HA: Adaptec 3940U, Bios 1.24
> > Channel A:
> > 0 : CD Sony CDU-76S
> > 1 : HD Seagate ST32430N
> > 3 : CDRW Yamaha CRW4416S 1.0f
> > 4 : Streamer Tandberg NS20 Pro
> > 5 : HD IBM DCAS-34330
> > 6 : HD IBM DCAS-34330W
> > (End of SCSI-bus with active termination, and AHA with auto-termination.)
> > Channel B:
> > 0 : Olympus Deltis-MOS320 (MO)
> > 3 : HP ScanJet
> > (End of SCSI-bus with passive termination, and AHA with auto-termination.)
> >
> > What was changed in the aic7xxx-code after 5.0.10/3.2.2?
> >
> > What can I do to help you find the bug?
> >
> > Stephan.

OK, I must have missed this report somehow.  Anyway, the big item of change
between the 5.0.10 and later versions is that all later versions default to
using MMAPIO instead of PIO.  So, if you want to test things out, go into the
aic7xxx.c file, find the line that reads:

#define MMAPIO

and comment that line out, then recompile.  That should disable MMAPed I/O on
your system, and that will let us know if your problem is related to
simultaneous I/O to different MMAP regions on the card.  Note, there may be
more than one #define MMAPIO line in your source code, but assuming you
are using an Intel based machine, you need only find the one in the #ifdef
__i386__ block of code.  You can ignore the other architectures.  I would give
an exact line number, but depending on which patch you use this could be
greatly different, as that section of code was in a state of flux during the
5.0.10->14 days.


-- 
  Doug Ledford   <dledford at redhat.com>
   Opinions expressed are my own, but
      they should be everybody's.





