(2) 5.1-R-p2 crashes on SMP with AMI RAID and Intel 1000/Pro
Colin Faber
cfaber at fpsn.net
Wed Aug 20 11:18:36 PDT 2003
Hi,
I've got nearly the same setup: a Dell 1600SC with a gig of RAM, a PERC4/Sc (LSI MegaRAID) card,
and dual 2.4GHz Xeon P4 HT CPUs. I've discovered I can lock up FreeBSD 5.1-RELEASE-p2 on demand
simply by running something that quickly creates and removes a directory, e.g.:
perl -e 'for(my $i = 0 ; $i < 9999; $i++){ mkdir("abc"); rmdir("abc"); }'
Setting machdep.cpu_idle_hlt=0 makes no difference.
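For anyone who wants to reproduce this without Perl, here is a rough Python equivalent of the loop above. It is only a sketch: the function name churn_directory and the scratch directory are my own choices for illustration, and unlike the Perl one-liner it stops on the first mkdir/rmdir error instead of silently ignoring it.

```python
import os
import tempfile


def churn_directory(parent, name="abc", iterations=9999):
    """Create and remove parent/name repeatedly; return the completed cycles."""
    target = os.path.join(parent, name)
    completed = 0
    for _ in range(iterations):
        os.mkdir(target)   # raises OSError if the directory cannot be created
        os.rmdir(target)   # raises OSError if it cannot be removed
        completed += 1
    return completed


if __name__ == "__main__":
    # Scratch directory for illustration only; to exercise the RAID volume,
    # point this at a directory on the amrd0-backed filesystem instead.
    scratch = tempfile.mkdtemp()
    print(churn_directory(scratch), "mkdir/rmdir cycles completed in", scratch)
    os.rmdir(scratch)
```

On a healthy system this simply finishes and prints the cycle count; on the box described here I would expect the lockup before it gets that far.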
Kernel:
FreeBSD 5.1-RELEASE-p2 FreeBSD 5.1-RELEASE-p2 #0: Mon Aug 11 21:40:47 MDT 2003 i386
Raid:
amr0: <LSILogic MegaRAID> mem 0xfcd00000-0xfcd0ffff irq 3 at device 2.0 on pci1
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 34556MB (70770688 sectors) RAID 5 (optimal)
I suspect that your problem and mine are related to the amr driver and may be exposing
some other problem within the kernel's fs locking. I don't think (as others have suggested) that
your issue is power related, or related to the combination of hardware you're using (other than
the fact that you've got a MegaRAID card).
The exact crash message I'm seeing is:
panic: lockmgr: locking against myself
cpuid = 0; lapic.id 00000000
boot() called on cpu#0
syncing disks, buffers remaining... panic: ffs_copyonwrite: recursive call
cpuid = 0; lapic.id 00000000
boot() called on cpu#0
Uptime: 58s
pfs_vncache_unload(): 7 entries remaining
amr0: flushing cache...done
Terminate ACPI
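As far as I understand it, the "locking against myself" panic means a thread asked for an exclusive lock it already holds, with recursion not allowed. The toy model below is only a userland analogy in Python, not the kernel's lockmgr code; the class name OwnedLock is hypothetical, and it raises an exception where the kernel would panic.

```python
import threading


class OwnedLock:
    """Non-recursive exclusive lock that remembers its owner (toy model)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            # The kernel would panic at this point; the toy model raises.
            raise RuntimeError("locking against myself")
        self._lock.acquire()
        self._owner = me

    def release(self):
        self._owner = None
        self._lock.release()


lock = OwnedLock()
lock.acquire()
message = None
try:
    lock.acquire()          # second request from the same thread
except RuntimeError as exc:
    message = str(exc)
finally:
    lock.release()
print(message)
```

If that reading is right, some code path in the mkdir/rmdir sequence is re-entering a lock it already owns, which would fit a locking bug rather than a hardware fault.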
Hartmann, O. wrote:
> Dear Sirs.
>
> It seems to me a never-ending story. We run a box with a TYAN Thunder
> 2500 dual SMP mainboard, 2GB of ECC Tyan-certified memory, an AMI Enterprise
> 1600 RAID adapter and an additional Intel 1000/Pro server-type (64 bit)
> GBit LAN NIC. With FreeBSD 4.8 this was stable, but achieving that
> state was really hard! The story was similar when
> we moved this machine to FreeBSD 5.1-RELEASE-p2.
>
> It seems to depend heavily on which PCI slot each card is
> installed in, so I will report that here as well.
>
> Phenomenon:
>
> After the machine has been running for a while, the SMP kernel reboots
> spontaneously. This happens under heavy I/O, for example when compiling or when, in the
> morning, our department starts work and our staff connect to the Samba
> server.
>
> Depending on which devices are switched on or off in the BIOS, the kernel
> freezes at the stage where the amr0 RAID is recognized. I can avoid this
> by enabling the built-in NIC (fxp0). I can force it by putting the em0
> NIC into another slot, for instance the one remaining 64-bit/66MHz
> slot (which should be a separate bus).
>
> This 'game' was identical to the one I had with FreeBSD 4.X - 4.8, and I
> found out that putting an additional NIC into PCI slot No. 2 (counted
> from the AGP slot) cleared things up, but using both NICs together
> (either the additional fxp0 or the new em0) leaves the system completely
> unstable.
>
> In FreeBSD 5.1-RELEASE-p2 and especially in FreeBSD 5.1-CURRENT this
> 'gambling' seems to reach its climax. My kernel is built with
> SCHED_4BSD because a kernel with SCHED_ULE and ADAPTIVE_MUTEXES crashes immediately
> in the same way as described (running a while, then dumping core or freezing
> at the stage after the amr0 RAID shows up in the kernel boot messages;
> see the dmesg output below).
>
> I'm not a hardware expert, but all this weird stuff looks to me like
> an IRQ routing problem. I fiddled around with many hand-assigned IRQ configurations,
> but nothing helped. Either the Intel 1000/Pro or the AMI RAID is causing
> problems in the TYAN Thunder 2500 SMP environment.
>
> We also have an SMP machine with similar hardware, based on an ASUS CV4X-D,
> an AMI Elite 1600 RAID controller and the same Intel em0 1GBit NIC. The OS is
> FreeBSD 4.8, and this system has never had any problem!
>
> I feel a little helpless at the moment, because I think I have tried every trick,
> and something seems to be wrong with the combination of TYAN Thunder 2500 and FreeBSD
> 5.X SMP. It is also very curious that a kernel without SMP/IO_APIC freezes
> at the same place during boot (amr0 RAID recognition).
>
> Is there any help out there?
>
> I attach the kernel config file and the dmesg output. Please note: I disabled both
> serial ports, the parallel port, sound and USB to free up additional IRQs. But I have to
> enable the built-in NIC to get a bootable, though unstable, FreeBSD 5.1-R box.
>
> ====================================
> DMESG output
> ====================================
>
> Copyright (c) 1992-2003 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.1-RELEASE-p2 #14: Wed Aug 13 09:47:00 CEST 2003
> root at atmos.physik.uni-mainz.de:/usr/obj/usr/src/sys/ATMOS
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0458000.
> Timecounter "i8254" frequency 1193182 Hz
> Timecounter "TSC" frequency 868644793 Hz
> CPU: Intel Pentium III (868.64-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x683 Stepping = 3
> Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
> real memory = 2147483648 (2048 MB)
> avail memory = 2085625856 (1989 MB)
> Programming 16 pins in IOAPIC #0
> IOAPIC #0 intpin 2 -> irq 0
> Programming 16 pins in IOAPIC #1
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
> cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
> io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
> io1 (APIC): apic id: 3, version: 0x000f0011, at 0xfec01000
> netsmb_dev: loaded
> Pentium Pro MTRR support enabled
> npx0: <math processor> on motherboard
> npx0: INT 16 interface
> pcibios: BIOS version 2.10
> Using $PIR table, 12 entries at 0xc00fdf00
> pcib0: <Host to PCI bridge> at pcibus 0 on motherboard
> pci0: <PCI bus> on pcib0
> IOAPIC #1 intpin 13 -> irq 2
> IOAPIC #1 intpin 12 -> irq 16
> IOAPIC #1 intpin 2 -> irq 17
> IOAPIC #1 intpin 7 -> irq 18
> pcib1: <PCI-PCI bridge> at device 0.1 on pci0
> pci1: <PCI bus> on pcib1
> IOAPIC #1 intpin 1 -> irq 19
> pci1: <display, VGA> at device 0.0 (no driver attached)
> sym0: <896> port 0xf800-0xf8ff mem 0xfeafe000-0xfeafffff,0xfeafac00-0xfeafafff irq 2 at device 1.0 on pci0
> sym0: Symbios NVRAM, ID 7, Fast-40, SE, parity checking
> sym0: open drain IRQ line driver, using on-chip SRAM
> sym0: using LOAD/STORE-based firmware.
> sym0: handling phase mismatch from SCRIPTS.
> sym1: <896> port 0xf400-0xf4ff mem 0xfeafc000-0xfeafdfff,0xfeafa800-0xfeafabff irq 16 at device 1.1 on pci0
> sym1: Symbios NVRAM, ID 7, Fast-40, LVD, parity checking
> sym1: open drain IRQ line driver, using on-chip SRAM
> sym1: using LOAD/STORE-based firmware.
> sym1: handling phase mismatch from SCRIPTS.
> em0: <Intel(R) PRO/1000 Network Connection, Version - 1.5.31> port 0xfcc0-0xfcff mem 0xfeac0000-0xfeadffff irq 17 at device 4.0 on pci0
> em0: Speed:1000 Mbps Duplex:Full
> fxp0: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0xfc40-0xfc7f mem 0xfe900000-0xfe9fffff,0xfeaf9000-0xfeaf9fff irq 18 at device 7.0 on pci0
> fxp0: Ethernet address 00:e0:81:00:f0:d7
> miibus0: <MII bus> on fxp0
> inphy0: <i82555 10/100 media interface> on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> isab0: <PCI-ISA bridge> port 0x500-0x50f at device 15.0 on pci0
> isa0: <ISA bus> on isab0
> pci0: <mass storage, ATA> at device 15.1 (no driver attached)
> pcib2: <ServerWorks host to PCI bridge> at pcibus 2 on motherboard
> pci2: <PCI bus> on pcib2
> pcib3: <PCI-PCI bridge> at device 2.0 on pci2
> pci3: <PCI bus> on pcib3
> IOAPIC #1 intpin 11 -> irq 20
> IOAPIC #1 intpin 8 -> irq 21
> pcib4: <PCI-PCI bridge> at device 0.0 on pci3
> pci4: <PCI bus> on pcib4
> IOAPIC #1 intpin 10 -> irq 22
> amr0: <LSILogic MegaRAID> mem 0xf0000000-0xf3ffffff irq 22 at device 0.0 on pci4
> amr0: <LSILogic MegaRAID Enterprise 1600> Firmware G170, BIOS F316, 64MB RAM
> pci3: <mass storage, SCSI> at device 1.0 (no driver attached)
> pci3: <mass storage, SCSI> at device 2.0 (no driver attached)
> orm0: <Option ROMs> at iomem 0xca000-0xcdfff,0xc0000-0xc9fff on isa0
> fdc0: <Enhanced floppy controller (i82077, NE72065 or clone)> at port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
> fdc0: FIFO enabled, 8 bytes threshold
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
> atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: model IntelliMouse, device ID 3
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <8 virtual consoles, flags=0x300>
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> sio0: type 8250 or not responding
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> ppc0: parallel port not found.
> unknown: <PNP0303> can't assign resources (port)
> psmcpnp0: irq resource info is missing; assuming irq 12
> unknown: <PNP0700> can't assign resources (port)
> ppc1: parallel port not found.
> APIC_IO: Testing 8254 interrupt delivery
> APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
> APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
> Timecounters tick every 1.000 msec
> ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, logging unlimited
> DUMMYNET initialized (011031)
> Waiting 5 seconds for SCSI devices to settle
> (noperiph:sym0:0:-1:-1): SCSI BUS reset delivered.
> (noperiph:sym1:0:-1:-1): SCSI BUS reset delivered.
> amrd0: <LSILogic MegaRAID logical drive> on amr0
> amrd0: 245014MB (501788672 sectors) RAID 5 (optimal)
>
> ===> freezing here!
>
> sa0 at sym1 bus 0 target 5 lun 0
> sa0: <HP C5713A H910> Removable Sequential Access SCSI-2 device
> sa0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
> ch0 at sym1 bus 0 target 5 lun 1
> ch0: <HP C5713A H910> Removable Changer SCSI-2 device
> ch0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit)
> ch0: 6 slots, 1 drive, 0 pickers, 0 portals
> SMP: AP CPU #1 Launched!
> Mounting root from ufs:/dev/amrd0s1a
> cd0 at sym0 bus 0 target 3 lun 0
> cd0: <TEAC CD-ROM CD-532S 1.0A> Removable CD-ROM SCSI-2 device
> cd0: 20.000MB/s transfers (20.000MHz, offset 16)
> cd0: Attempt to query device size failed: NOT READY, Medium not present
>
> ========================
> KERNEL config file
> ========================
>
> machine i386
> cpu I686_CPU
> ident ATMOS
>
> options SMP # Symmetric MultiProcessor Kernel
> options APIC_IO # Symmetric (APIC) I/O
>
> maxusers 0
>
> hints "ATMOS.hints" #Default places to look for devices.
>
>
> #options COMPAT_FREEBSD4
> options SCHED_4BSD #4BSD scheduler
>
> #options SCHED_ULE
> #options ADAPTIVE_MUTEXES
>
> #options PQ_CACHESIZE=256
>
> options CPU_ENABLE_SSE
>
> options CLK_USE_TSC_CALIBRATION
> #options HZ=1000
>
> #makeoptions CONF_CFLAGS=-fno-builtin
> #options MAXDSIZ=(1024UL*1024*1024)
> #options MAXSSIZ=(128UL*1024*1024)
> #options DFLDSIZ=(1024UL*1024*1024)
>
> options GEOM_AES
> options GEOM_APPLE
> options GEOM_BDE
> options GEOM_BSD
> options GEOM_GPT
> options GEOM_MBR
> options GEOM_PC98
> options GEOM_SUNLABEL
> options GEOM_VOL
>
> options ROOTDEVNAME=\"ufs:amrd0s1a\"
>
> options INET #InterNETworking
> #options INET6 #IPv6 communications protocols
> options FFS #Berkeley Fast Filesystem
> options SOFTUPDATES #Enable FFS soft updates support
> options UFS_ACL #Support for access control lists
> options UFS_DIRHASH #Improve performance on big directories
> options NFSCLIENT #Network Filesystem Client
> options NFSSERVER #Network Filesystem Server
> options MSDOSFS #MSDOS Filesystem
> options CD9660 #ISO 9660 Filesystem
> options PROCFS #Process filesystem (requires PSEUDOFS)
> options PSEUDOFS #Pseudo-filesystem framework
> options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
> options SCSI_DELAY=5000 #Delay (in ms) before probing SCSI
>
> options SYSVSHM #SYSV-style shared memory
> options SYSVMSG #SYSV-style message queues
> options SYSVSEM #SYSV-style semaphores
>
> options NETSMB
> options NETSMBCRYPTO
> options LIBMCHAIN
> options LIBICONV
>
> #options WATCHDOG
>
> options NETGRAPH
> #options NETGRAPH_ASYNC
> #options NETGRAPH_BPF
> #options NETGRAPH_BRIDGE
> #options NETGRAPH_CISCO
> #options NETGRAPH_ECHO
> #options NETGRAPH_ETHER
> #options NETGRAPH_FRAME_RELAY
> #options NETGRAPH_GIF
> #options NETGRAPH_GIF_DEMUX
> #options NETGRAPH_HOLE
> #options NETGRAPH_IFACE
> #options NETGRAPH_IP_INPUT
> #options NETGRAPH_KSOCKET
> #options NETGRAPH_L2TP
> #options NETGRAPH_LMI
> #options NETGRAPH_MPPC_ENCRYPTION
> #options NETGRAPH_ONE2MANY
> #options NETGRAPH_PPP
> #options NETGRAPH_PPPOE
> #options NETGRAPH_PPTPGRE
> #options NETGRAPH_RFC1490
> #options NETGRAPH_SOCKET
> #options NETGRAPH_SPLIT
> #options NETGRAPH_TEE
> #options NETGRAPH_TTY
> #options NETGRAPH_UI
> #options NETGRAPH_VJC
>
> options MROUTING
> options IPFIREWALL
> options IPFIREWALL_VERBOSE
> options IPFIREWALL_FORWARD
> #options IPFIREWALL_VERBOSE_LIMIT=100
> #options IPFIREWALL_DEFAULT_TO_ACCEPT
> #options IPV6FIREWALL
> #options IPV6FIREWALL_VERBOSE
> #options IPV6FIREWALL_VERBOSE_LIMIT=100
> #options IPV6FIREWALL_DEFAULT_TO_ACCEPT
> options IPDIVERT
> #options IPFILTER
> #options IPFILTER_LOG
> #options IPFILTER_DEFAULT_BLOCK
> options IPSTEALTH
>
> options RANDOM_IP_ID
>
> options ACCEPT_FILTER_DATA
> #options ACCEPT_FILTER_HTTP
>
> options TCP_DROP_SYNFIN
> options DUMMYNET
> #options BRIDGE
>
> options QUOTA
>
> options _KPOSIX_PRIORITY_SCHEDULING
> options P1003_1B_SEMAPHORES
>
> #options MAC
> #options MAC_BIBA
> #options MAC_BSDEXTENDED
> #options MAC_DEBUG
> #options MAC_IFOFF
> #options MAC_LOMAC
> #options MAC_MLS
> #options MAC_NONE
> #options MAC_PARTITION
> #options MAC_SEEOTHERUIDS
> #options MAC_TEST
>
> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
>
> device isa
> #options AUTO_EOI_1
>
> device pci
>
> device agp
>
> # Floppy drives
> device fdc
>
> # SCSI Controllers
> device sym # NCR/Symbios Logic (newer chipsets + those of `ncr')
> #device ahc
>
> # SCSI peripherals
> device scbus # SCSI bus (required)
> device ch # SCSI media changers
> device da # Direct Access (disks)
> device sa # Sequential Access (tape etc)
> device cd # CD
> device pass # Passthrough device (direct SCSI access)
> device ses # SCSI Environmental Services (and SAF-TE)
>
>
> # RAID controllers
> device amr # AMI MegaRAID
>
>
> #options CHANGER_MIN_BUSY_SECONDS=2
> #options CHANGER_MAX_BUSY_SECONDS=10
>
> #options SA_IO_TIMEOUT=4
> #options SA_SPACE_TIMEOUT=60
> #options SA_REWIND_TIMEOUT=(2*60)
> #options SA_ERASE_TIMEOUT=(4*60)
> #options SA_1FM_AT_EOD
>
> #options SCSI_PT_DEFAULT_TIMEOUT=60
> options SES_ENABLE_PASSTHROUGH
>
>
> # atkbdc0 controls both the keyboard and the PS/2 mouse
> device atkbdc # AT keyboard controller
> device atkbd # AT keyboard
> options ATKBD_DFLT_KEYMAP
> makeoptions ATKBD_DFLT_KEYMAP=us.iso
>
> device psm # PS/2 mouse
>
> device vga # VGA video card driver
>
> device splash # Splash screen and screen saver support
>
> # syscons is the default console driver, resembling an SCO console
> device sc
> options MAXCONS=8
>
> #options SC_ALT_MOUSE_IMAGE
> options SC_DFLT_FONT
> makeoptions SC_DFLT_FONT=cp850
>
> options SC_DISABLE_DDBKEY
> options SC_DISABLE_REBOOT
> options SC_HISTORY_SIZE=512
> #options SC_MOUSE_CHAR=0x3
> options SC_PIXEL_MODE
> options SC_NORM_ATTR=(FG_GREEN|BG_BLACK)
> options SC_NORM_REV_ATTR=(FG_YELLOW|BG_GREEN)
> options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
> options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
> #options SC_CUT_SPACES2TABS
> #options SC_CUT_SEPCHARS=\"x09\"
> #options SC_TWOBUTTON_MOUSE
> #options SC_NO_CUTPASTE
> #options SC_NO_FONT_LOADING
> #options SC_NO_HISTORY
> #options SC_NO_SYSMOUSE
> #options SC_NO_SUSPEND_VTYSWITCH
>
> device npx
>
> #device pmtimer
>
> #device sio # 8250, 16[45]50 based serial ports
>
> # Parallel port
> #device ppc
> #device ppbus # Parallel port bus (required)
> #device lpt # Printer
> #device plip # TCP/IP over parallel
> #device ppi # Parallel port interface device
> #device vpo # Requires scbus and da
>
>
> device miibus # MII bus support
> device em
> #device fxp # Intel EtherExpress PRO/100B (82557, 82558)
>
> device random # Entropy device
> device loop # Network loopback
> device ether # Ethernet support
> #device tun # Packet tunnel.
> device pty # Pseudo-ttys (telnet etc)
> #device gif # IPv6 and IPv4 tunneling
> #device faith # IPv6-to-IPv4 relaying (translation)
>
> device bpf # Berkeley packet filter
>
>
> ------------------
>
>
> Thanks a lot for your help,
>
> Oliver
> --
> MfG
> O. Hartmann
>
> ohartman at mail.physik.uni-mainz.de
> ------------------------------------------------------------------
> Systemadministration des Institutes fuer Physik der Atmosphaere (IPA)
> ------------------------------------------------------------------
> Johannes Gutenberg Universitaet Mainz
> Becherweg 21
> 55099 Mainz
>
> Tel: +496131/3924662 (Maschinenraum)
> Tel: +496131/3924144 (Buero)
> FAX: +496131/3923532
> _______________________________________________
> freebsd-smp at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-smp
> To unsubscribe, send any mail to "freebsd-smp-unsubscribe at freebsd.org"
>
>