FreeBSD 4.8, ASR2120, SMP, degraded RAID1/mirror => storage failure
rysanek at fccps.cz
Fri Sep 5 00:20:44 PDT 2003
Dear Mr. Long,
firstly, let me thank you for maintaining the Adaptec RAID drivers.
I've got a problem with the Adaptec 2120S in FreeBSD 4.8-RELEASE
and I haven't found any notes about that in the mailing lists.
In SMP mode, upon a RAID array degradation event (a disk is
ripped out), the system locks up almost entirely, stuck at
disk operations.
The same happens upon boot with a rebuilding/degraded array
- building from scratch or rebuilding after a disk failure,
or even just running off a single disk while the other one
is dead (no rebuild going on in the background).
The problem doesn't occur in UP mode (when options SMP and
APIC_IO are off) - that way the host system works happily
just as if there was nothing wrong with the array (except
for a few **Monitor** warnings and the LEDs going disco).
The problem also doesn't occur as long as the RAID is
"optimal".
The problem was only observed and tested in a configuration
with two disks in a mirror (one or two logical "containers"
on them), no hot spare.
My system configuration is:
2x Intel P4 Xeon @ 2.4 GHz, 533MHz FSB
1 GB RAM (dual-channel, 2x 512 MB DIMM DDR266, ECC, REG)
ServerWorks GC-LE chipset, PCI-X 64bit at 133MHz
2x3 SCA backplane with two GEM318 SAF-TE processors
On one channel, there are two Seagate ST336607LC (36 GB) drives
(+ 2x onboard BCM570x GbETH, 2x onboard AIC7902,
onboard ATI RageXL PCI 8MB, etc)
The array on the AAC controller is the only disk drive in the
system -> the machine is booting from it.
To the best of my knowledge, the mechanical and electrical
parts of the U320 system are fine: they have worked for me
under Linux and with other SCSI controllers, and the
dual-channel onboard U320 HBA works fine, too.
Attached is a tarball with debugging logs.
There are three directories, containing three different
combinations of debug options (see below items A to C).
Each directory contains six log files: a boot from a clean
array, a disk failure (somewhat improperly simulated by
ripping the SCA enclosure out), and a boot from a degraded
volume - all of that for a UP and SMP kernel. 3*2=6.
I've tried the following different debugging options and levels:
A) full CAM debug and AAC_DEBUG=2
B) AAC_DEBUG=2
C) AAC_DEBUG=4 (after I found in the sources that level 4 exists)
With A), everything worked as described above.
However, the CAM debugging messages probably flooded the
kernel message buffer to the extent that some of the AAC_DEBUG
and generic messages are missing from the log, such as those
announcing the detection of /dev/aacd0 and /dev/aacd1
(the two RAID volumes/containers).
With B), upon runtime disk failure, the fault occurred even in the UP configuration!
-- while UP kernels without debugging continued to operate,
and even the UP kernel as per B) continued to run fine
after reboot, on the failed array.
With C), __SMP__: the machine behaved as expected (dead) upon
runtime disk failure, but consistently managed to boot with
the degraded array while it was not rebuilding (=anomaly)
- then it crashed when I logged in and told it to `reboot`.
When I plugged the disk drive back and the array started
rebuilding, the SMP kernel consistently failed to boot.
__UP__: the machine was consistently failing miserably
upon array degradation (=anomaly). It did boot fine
consistently with a degraded array (not rebuilding).
It failed at boot consistently with a rebuilding array.
So it seems that the serial logging / debugging stuff modifies
timing, and hence the behavior with debugging on is different.
This reminds me of Heisenberg's uncertainty principle.
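One way to take the serial line out of the timing picture would be to record events in memory and print them only afterwards. At 115200 bps with 8N1 framing (the CONSPEED value in the attached kernel config, assuming the console really runs at that speed), the line carries about 11,520 characters per second, so a single 80-character debug line costs roughly 7 ms. The following is a minimal user-space sketch in C of such a trace ring. All names (trace_ring, trace_record, trace_count, trace_get) are hypothetical and not part of the aac driver, and the code is not SMP-safe as written; a kernel version would need a lock or per-CPU buffers.

```c
#include <stddef.h>

#define TRACE_SLOTS 64  /* hypothetical capacity of the ring */

struct trace_event {
    unsigned int seq;   /* running event number */
    int          code;  /* caller-defined event code */
    unsigned int arg;   /* caller-defined detail */
};

struct trace_ring {
    unsigned int       next;               /* total events recorded so far */
    struct trace_event slot[TRACE_SLOTS];
};

/* Record one event; once the ring is full, the oldest event is overwritten. */
static void
trace_record(struct trace_ring *r, int code, unsigned int arg)
{
    struct trace_event *e = &r->slot[r->next % TRACE_SLOTS];

    e->seq = r->next;
    e->code = code;
    e->arg = arg;
    r->next++;
}

/* Number of events currently held in the ring. */
static unsigned int
trace_count(const struct trace_ring *r)
{
    return r->next < TRACE_SLOTS ? r->next : TRACE_SLOTS;
}

/* Return the i-th oldest surviving event (0 = oldest), or NULL if out of range. */
static const struct trace_event *
trace_get(const struct trace_ring *r, unsigned int i)
{
    unsigned int n = trace_count(r);

    if (i >= n)
        return NULL;
    return &r->slot[(r->next - n + i) % TRACE_SLOTS];
}
```

The ring would be filled from the hot path with trace_record() and walked with trace_get() only after the interesting event is over, so the slow console output no longer sits between the driver's steps.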
Still, without debugging, the consistent pattern is:
UP = boots fine from a clean array, survives array degradation
and boots from a degraded array.
SMP = boots fine from a clean array, does not survive array degradation
and fails to boot from a degraded array.
While I was trying to find a typical healthy "SCSI request/response"
pattern in the logs, it seemed to me that quite often some of the
debugging messages were missing, and some were clearly cut in
half or so - perhaps I should check my RS232 cabling? Though
I really think that my cabling is all right...
From the debug listings it would seem that the AAC driver
on the host PC gets a zero-padded FIB from the controller,
and then an endless row of interrupts.
This happens immediately after a disk failure or after driver
initialization upon boot.
The following is a piece of pseudo-code for your reference,
based on /usr/src/sys/dev/aac/aac.c. The aac_host_command()
function forms the body of a kthread that gets started upon
adapter initialization. Note the line with "!!!":
aac_host_command()
{
    while(true)
    {
        tsleep();
        for (;;)
        {
            // check for enqueued FIBs
            aac_dequeue_fib(AAC_HOST_NORM_CMD_QUEUE);
            if (found one)
            {
                // process it
            }
            else
            {
                break; // go to sleep again
            }
        }
    }
}

aac_dequeue_fib()
{
    if (ci != pi) // consumer/producer indices
    {
        // there are some FIBs in the queue
        // !!! at the same time, the FIB is zero-padded !!!
    }
    else return(ENOENT);
}
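To make the "!!!" condition concrete, here is a minimal, self-contained C sketch of a producer/consumer index queue whose dequeue routine flags an all-zero entry. It is not the real driver code: the structure layout, the sizes, the sketch_* names, and the choice of EIO as the "zeroed FIB" return value are all assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>
#include <errno.h>

#define QUEUE_ENTRIES 8   /* hypothetical queue depth */
#define FIB_BYTES     16  /* hypothetical; a real FIB is much larger */

struct sketch_fib {
    unsigned char bytes[FIB_BYTES];
};

struct sketch_queue {
    unsigned int      pi;  /* producer index, advanced by the controller */
    unsigned int      ci;  /* consumer index, advanced by the host */
    struct sketch_fib slot[QUEUE_ENTRIES];
};

/* Return 1 if every byte of the FIB is zero, otherwise 0. */
static int
sketch_fib_is_zeroed(const struct sketch_fib *fib)
{
    size_t i;

    for (i = 0; i < FIB_BYTES; i++)
        if (fib->bytes[i] != 0)
            return 0;
    return 1;
}

/*
 * Dequeue one entry into *out.
 * Returns 0 on success, ENOENT if the queue is empty (ci == pi), or
 * EIO if the indices claim an entry is present but it is all zeroes.
 * The zeroed entry is still consumed so that the caller can log it
 * and move on instead of spinning on the same slot.
 */
static int
sketch_dequeue_fib(struct sketch_queue *q, struct sketch_fib *out)
{
    if (q->ci == q->pi)
        return ENOENT;

    *out = q->slot[q->ci];
    q->ci = (q->ci + 1) % QUEUE_ENTRIES;

    if (sketch_fib_is_zeroed(out))
        return EIO;
    return 0;
}
```

A caller shaped like the loop in aac_host_command() above could treat ENOENT as "go back to sleep" and EIO as the anomaly worth logging together with the current index values.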
Another symptom is that, upon array degradation, the controller
seems to reset the RAID-private SCSI bus (I hope that's what
the **Monitor** message says).
The trouble is that both the aac_host_command() wakeup with the
zero-padded FIB and the monitor messages appear in asynchronous
context (in a separate kthread or in an interrupt), and I'm not
skilled enough to say which previous action of the driver is
the immediate cause.
More on the behavior of the disk LEDs:
These LEDs on my server case are controlled by the SCA/SAF-TE
chip (GEM318).
- When the array is degraded but operating normally, the dead
disk's LED is dark and the live disk's LED flashes green,
indicating normal storage transfers.
- When a degraded array is rebuilding, the two disk LEDs dance
in shades of green to orange (both the green and red
pads flashing).
- When the whole controller or the RAID-private SCSI channel
is being reset, both LEDs shine a steady red.
- When the machine fails at boot with a rebuilding array,
the LEDs often turn red for a few seconds (reset?) and then
one of them remains red and the other one starts dancing
green/orange... and the reset may come back a few times
before the machine locks up entirely or FreeBSD manages
to do an auto-reboot. Or the LEDs just stay red and the
machine hangs.
- When the machine boots and runs fine (i.e., with a UP kernel
under normal conditions), the disk LEDs never go red, except
for a cold reset of the whole PC. When the array is rebuilding,
the LEDs keep dancing merrily between green and orange
throughout the boot process.
I guess this would indicate that it's not just the BSD driver
getting messed up - the controller probably also gets
seriously confused. Is that a chicken-vs.-egg style puzzle?
As a side note: it seems interesting to me that, regardless
of whether debugging and SMP are on or off in any particular
combination, the kernel always rushes through to
"Waiting 15 seconds for the SCSI devices to settle"
and _immediately_ reports the RAID containers.
Only then does it wait those fifteen seconds before proceeding
to detect the regular SCSI devices.
Attached are my kernel config file and a listing of
`pciconf -lv`.
I can't think of anything else to tell you at the moment.
Ask me if you need further help - perhaps I can modify the
debugging flags and try again, add some more instrumentation
hooks here and there to focus on particular points in the
code etc.
Any ideas are welcome.
Sorry about taking up your time with such a lengthy
explanation.
Thanks for the great job that you're doing.
Frank Rysanek
-------------- next part --------------
chip0 at pci0:0:0: class=0x060000 card=0x00000000 chip=0x00141166 rev=0x31 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CNB20-HE Host Bridge'
class = bridge
subclass = HOST-PCI
chip1 at pci0:0:1: class=0x060000 card=0x00000000 chip=0x00141166 rev=0x00 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CNB20-HE Host Bridge'
class = bridge
subclass = HOST-PCI
chip2 at pci0:0:2: class=0x060000 card=0x00000000 chip=0x00151166 rev=0x00 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CMIC-GC Hostbridge and MCH'
class = bridge
subclass = HOST-PCI
none0 at pci0:2:0: class=0x030000 card=0x80041002 chip=0x47521002 rev=0x27 hdr=0x00
vendor = 'ATI Technologies'
device = 'Rage XL PCI'
class = display
subclass = VGA
isab0 at pci0:15:0: class=0x060100 card=0x02011166 chip=0x02011166 rev=0x93 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CSB5 PCI to ISA Bridge'
class = bridge
subclass = PCI-ISA
atapci0 at pci0:15:1: class=0x01018a card=0x02121166 chip=0x02121166 rev=0x93 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CSB5 PCI EIDE Controller'
class = mass storage
subclass = ATA
ohci0 at pci0:15:2: class=0x0c0310 card=0x02201166 chip=0x02201166 rev=0x05 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'OSB4 OpenHCI Compliant USB Controller'
class = serial bus
subclass = USB
chip3 at pci0:15:3: class=0x060000 card=0x02301166 chip=0x02251166 rev=0x00 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CSB5 PCI Bridge'
class = bridge
subclass = HOST-PCI
chip4 at pci0:17:0: class=0x060000 card=0x00000000 chip=0x01011166 rev=0x03 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CIOB-X2'
class = bridge
subclass = HOST-PCI
chip5 at pci0:17:2: class=0x060000 card=0x00000000 chip=0x01011166 rev=0x03 hdr=0x00
vendor = 'Reliance Computer Corp./ServerWorks'
device = 'CIOB-X2'
class = bridge
subclass = HOST-PCI
aac0 at pci3:4:0: class=0x010400 card=0x02869005 chip=0x02859005 rev=0x01 hdr=0x00
vendor = 'Adaptec'
device = 'AAC-RAID RAID Controller'
class = mass storage
subclass = RAID
bge0 at pci4:2:0: class=0x020000 card=0x000814e4 chip=0x164514e4 rev=0x15 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'BCM5701 NetXtreme Gigabit Ethernet'
class = network
subclass = ethernet
bge1 at pci4:3:0: class=0x020000 card=0x000814e4 chip=0x164514e4 rev=0x15 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'BCM5701 NetXtreme Gigabit Ethernet'
class = network
subclass = ethernet
ahd0 at pci4:4:0: class=0x010000 card=0x005e9005 chip=0x801d9005 rev=0x10 hdr=0x00
vendor = 'Adaptec'
class = mass storage
subclass = SCSI
ahd1 at pci4:4:1: class=0x010000 card=0x005e9005 chip=0x801d9005 rev=0x10 hdr=0x00
vendor = 'Adaptec'
class = mass storage
subclass = SCSI
-------------- next part --------------
#
# GENERIC -- Generic kernel configuration file for FreeBSD/i386
#
# For more information on this file, please read the handbook section on
# Kernel Configuration Files:
#
# http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ./LINT configuration file. If you are
# in doubt as to the purpose or necessity of a line, check first in LINT.
#
# $FreeBSD: src/sys/i386/conf/GENERIC,v 1.246.2.51.2.2 2003/03/25 23:35:15 jhb Exp $
machine i386
#cpu I386_CPU
#cpu I486_CPU
#cpu I586_CPU
cpu I686_CPU
ident GENERIC
maxusers 0
#makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols
options MATH_EMULATE #Support for x87 emulation
options INET #InterNETworking
#options INET6 #IPv6 communications protocols
options FFS #Berkeley Fast Filesystem
options FFS_ROOT #FFS usable as root device [keep this!]
options SOFTUPDATES #Enable FFS soft updates support
options UFS_DIRHASH #Improve performance on big directories
options MFS #Memory Filesystem
options MD_ROOT #MD is a potential root device
options NFS #Network Filesystem
options NFS_ROOT #NFS usable as root device, NFS required
options MSDOSFS #MSDOS Filesystem
options CD9660 #ISO 9660 Filesystem
options CD9660_ROOT #CD-ROM usable as root, CD9660 required
options PROCFS #Process filesystem
options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
options SCSI_DELAY=15000 #Delay (in ms) before probing SCSI
options UCONSOLE #Allow users to grab the console
options USERCONFIG #boot -c editor
options VISUAL_USERCONFIG #visual boot -c editor
options KTRACE #ktrace(1) support
options SYSVSHM #SYSV-style shared memory
options SYSVMSG #SYSV-style message queues
options SYSVSEM #SYSV-style semaphores
options P1003_1B #Posix P1003_1B real-time extensions
options _KPOSIX_PRIORITY_SCHEDULING
options ICMP_BANDLIM #Rate limit bad replies
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options AHC_REG_PRETTY_PRINT # Print register bitfields in debug
# output. Adds ~128k to driver.
options AHD_REG_PRETTY_PRINT # Print register bitfields in debug
# output. Adds ~215k to driver.
# To make an SMP kernel, the next two are needed
options SMP # Symmetric MultiProcessor Kernel
options APIC_IO # Symmetric (APIC) I/O
# To support HyperThreading, HTT is needed in addition to SMP and APIC_IO
options HTT # HyperThreading Technology
device isa
#device eisa
device pci
# Floppy drives
device fdc0 at isa? port IO_FD1 irq 6 drq 2
device fd0 at fdc0 drive 0
device fd1 at fdc0 drive 1
#
# If you have a Toshiba Libretto with its Y-E Data PCMCIA floppy,
# don't use the above line for fdc0 but the following one:
#device fdc0
# ATA and ATAPI devices
device ata0 at isa? port IO_WD1 irq 14
device ata1 at isa? port IO_WD2 irq 15
device ata
device atadisk # ATA disk drives
device atapicd # ATAPI CDROM drives
device atapifd # ATAPI floppy drives
device atapist # ATAPI tape drives
options ATA_STATIC_ID #Static device numbering
# SCSI Controllers
#device ahb # EISA AHA1742 family
#device ahc # AHA2940 and onboard AIC7xxx devices
device ahd # AHA39320/29320 and onboard AIC79xx devices
#device amd # AMD 53C974 (Tekram DC-390(T))
#device isp # Qlogic family
#device mpt # LSI-Logic MPT/Fusion
#device ncr # NCR/Symbios Logic
#device sym # NCR/Symbios Logic (newer chipsets)
#options SYM_SETUP_LP_PROBE_MAP=0x40
# Allow ncr to attach legacy NCR devices when
# both sym and ncr are configured
#device adv0 at isa?
#device adw
#device bt0 at isa?
#device aha0 at isa?
#device aic0 at isa?
#device ncv # NCR 53C500
#device nsp # Workbit Ninja SCSI-3
#device stg # TMC 18C30/18C50
# SCSI peripherals
device scbus # SCSI bus (required)
device da # Direct Access (disks)
device sa # Sequential Access (tape etc)
device cd # CD
device pass # Passthrough device (direct SCSI access)
# RAID controllers interfaced to the SCSI subsystem
#device asr # DPT SmartRAID V, VI and Adaptec SCSI RAID
#device dpt # DPT Smartcache - See LINT for options!
#device iir # Intel Integrated RAID
#device mly # Mylex AcceleRAID/eXtremeRAID
#device ciss # Compaq SmartRAID 5* series
# RAID controllers
device aac # Adaptec FSA RAID, Dell PERC2/PERC3
#options AAC_DEBUG=4
#device aacp # SCSI passthrough for aac (requires CAM)
#device ida # Compaq Smart RAID
#device amr # AMI MegaRAID
#device mlx # Mylex DAC960 family
#device twe # 3ware Escalade
#options CAMDEBUG
#options CAM_DEBUG_BUS=-1
#options CAM_DEBUG_TARGET=-1
#options CAM_DEBUG_LUN=-1
#options CAM_DEBUG_FLAGS="CAM_DEBUG_INFO|CAM_DEBUG_TRACE|CAM_DEBUG_SUBTRACE|CAM_DEBUG_CDB|CAM_DEBUG_XPT|CAM_DEBUG_PERIPH"
# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc0 at isa? port IO_KBD
device atkbd0 at atkbdc? irq 1 flags 0x1
device psm0 at atkbdc? irq 12
device vga0 at isa?
# splash screen/screen saver
pseudo-device splash
# syscons is the default console driver, resembling an SCO console
device sc0 at isa? flags 0x100
# Enable this and PCVT_FREEBSD for pcvt vt220 compatible console driver
#device vt0 at isa?
#options XSERVER # support for X server on a vt console
#options FAT_CURSOR # start with block cursor
# If you have a ThinkPAD, uncomment this along with the rest of the PCVT lines
#options PCVT_SCANSET=2 # IBM keyboards are non-std
device agp # support several AGP chipsets
# Floating point support - do not disable.
device npx0 at nexus? port IO_NPX irq 13
# Power management support (see LINT for more options)
device apm0 at nexus? disable flags 0x20 # Advanced Power Management
# PCCARD (PCMCIA) support
#device card
#device pcic0 at isa? irq 0 port 0x3e0 iomem 0xd0000
#device pcic1 at isa? irq 0 port 0x3e2 iomem 0xd4000 disable
# Serial (COM) ports
device sio0 at isa? port IO_COM1 flags 0x30 irq 4
device sio1 at isa? port IO_COM2 irq 3
device sio2 at isa? disable port IO_COM3 irq 5
device sio3 at isa? disable port IO_COM4 irq 9
options CONSPEED=115200
# Parallel port
device ppc0 at isa? irq 7
device ppbus # Parallel port bus (required)
device lpt # Printer
#device plip # TCP/IP over parallel
#device ppi # Parallel port interface device
#device vpo # Requires scbus and da
# PCI Ethernet NICs.
#device de # DEC/Intel DC21x4x (``Tulip'')
#device em # Intel PRO/1000 adapter Gigabit Ethernet Card (``Wiseman'')
#device txp # 3Com 3cR990 (``Typhoon'')
#device vx # 3Com 3c590, 3c595 (``Vortex'')
# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device miibus # MII bus support
#device dc # DEC/Intel 21143 and various workalikes
#device fxp # Intel EtherExpress PRO/100B (82557, 82558)
#device pcn # AMD Am79C97x PCI 10/100 NICs
#device rl # RealTek 8129/8139
#device sf # Adaptec AIC-6915 (``Starfire'')
#device sis # Silicon Integrated Systems SiS 900/SiS 7016
#device ste # Sundance ST201 (D-Link DFE-550TX)
#device tl # Texas Instruments ThunderLAN
#device tx # SMC EtherPower II (83c170 ``EPIC'')
#device vr # VIA Rhine, Rhine II
#device wb # Winbond W89C840F
#device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'')
device bge # Broadcom BCM570x (``Tigon III'')
# ISA Ethernet NICs.
# 'device ed' requires 'device miibus'
#device ed0 at isa? disable port 0x280 irq 10 iomem 0xd8000
#device ex
#device ep
#device fe0 at isa? disable port 0x300
# Xircom Ethernet
#device xe
# PRISM I IEEE 802.11b wireless NIC.
#device awi
# WaveLAN/IEEE 802.11 wireless NICs. Note: the WaveLAN/IEEE really
# exists only as a PCMCIA device, so there is no ISA attachment needed
# and resources will always be dynamically assigned by the pccard code.
#device wi
# Aironet 4500/4800 802.11 wireless NICs. Note: the declaration below will
# work for PCMCIA and PCI cards, as well as ISA cards set to ISA PnP
# mode (the factory default). If you set the switches on your ISA
# card for a manually chosen I/O address and IRQ, you must specify
# those parameters here.
#device an
# The probe order of these is presently determined by i386/isa/isa_compat.c.
#device ie0 at isa? disable port 0x300 irq 10 iomem 0xd0000
#device le0 at isa? disable port 0x300 irq 5 iomem 0xd0000
#device lnc0 at isa? disable port 0x280 irq 10 drq 0
#device cs0 at isa? disable port 0x300
#device sn0 at isa? disable port 0x300 irq 10
# Pseudo devices - the number indicates how many units to allocate.
pseudo-device loop # Network loopback
pseudo-device ether # Ethernet support
pseudo-device sl 1 # Kernel SLIP
pseudo-device ppp 1 # Kernel PPP
pseudo-device tun # Packet tunnel.
pseudo-device pty # Pseudo-ttys (telnet etc)
pseudo-device md # Memory "disks"
pseudo-device gif # IPv6 and IPv4 tunneling
pseudo-device faith 1 # IPv6-to-IPv4 relaying (translation)
# The `bpf' pseudo-device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
pseudo-device bpf #Berkeley packet filter
# USB support
device uhci # UHCI PCI->USB interface
device ohci # OHCI PCI->USB interface
device usb # USB Bus (required)
device ugen # Generic
device uhid # "Human Interface Devices"
device ukbd # Keyboard
device ulpt # Printer
device umass # Disks/Mass storage - Requires scbus and da
device ums # Mouse
#device uscanner # Scanners
#device urio # Diamond Rio MP3 Player
# USB Ethernet, requires mii
#device aue # ADMtek USB ethernet
#device cue # CATC USB ethernet
#device kue # Kawasaki LSI USB ethernet