2.1.8X boot hangs on 2940UW controller...

Robert G. Brown rgb at phy.duke.edu
Mon Mar 9 13:33:14 PST 1998


Dear All,

I have been having severe stability problems with 2.0.33, probably
associated with a race condition on the fast ethernet interrupt channel
that leads either to a deadlock or a corrupted kernel (and either way to
a nasty crash) whenever my dual PII's or dual PPro's are hit with a high
packet density (anything much bigger than 25 Kpps).

One suggestion that was made to try to "fix" the problem was to bite
the bullet and shift to 2.1.X kernels.  I have therefore spent a couple
of days trying to get a 2.1.88 kernel (with or without
pre-patch-2.1.89-5 applied) to compile and boot on my system.  I have
upgraded modules, binutils, libc, ld.so and mount.  I configure and
compile the kernel with aic7xxx support without incident.  However, when
I try to boot the kernel, it correctly detects my onboard 2940UW
controller (I'm running a SuperMicro P6DLS dual PII with onboard
2940UW), initializes it, and gets to the message:

(scsi0:0:-1:-1) Scanning channel for devices.

and hangs.  The disk activity LED lights up and stays lit.  It never
comes back, and I eventually have to reboot into 2.0.33 (or 2.0.30, 31,
32), all of which work perfectly as far as the Adaptec controller is
concerned.

Since the BIOS correctly reports all my devices, and 2.0.33 out of the
box or with aic7xxx 5.0.[5,7] works fine with all devices addressable, I
assume that my actual hardware configuration is OK (termination and all
that).  I could be wrong, of course, but I have no reason to think that
the scsi controller or any scsi device, particularly the hard disks,
is not working correctly.

As an experiment, I installed the 5.0.5 aic7xxx driver sources under the
2.1.88 kernel.  The installation was fairly straightforward except for
kernel/printk.c, where it was difficult for me to isolate the
modifications that might have been made for the aic7xxx driver from the
considerable modifications associated with 2.1.X.  In order to get a
good compile, I left the standard 2.1.88 printk.c in place and hoped
that I'd get past the channel scan if the problem was resolved in the
5.0.5 driver; I could always patch printk the rest of the way later if I
got a good boot.  The kernel with 5.0.5 aic7xxx drivers compiled without
incident, booted, and hung in exactly the same spot.  I can fairly
easily generate diffs for the 2.1.X kernel sources and the aic7xxx 5.0.5
sources if anybody wants them, EXCEPT for kernel/printk.c, which remains
to be done.

SO, can anybody help me out here?  Any idea as to why the 2.1 boot hangs
scanning the scsi0 channel for devices where a 2.0 kernel does not,
running the same drivers?  Although it is somewhat difficult to mail a
complete copy of the console messages that come up when I'm booting
(most of which look "normal" except for a lot of stuff associated with
IO-APIC remapping of IRQ's), I can certainly write down and transcribe
anything that might be of use diagnosing the problem.

  Thanks,

       rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






