i386/132206: mpt driver: system panics on boot when mirroring and 2nd drive is resyncing

Geoffrey Lassner fbsd71p3_mpt_bug at zyni.com
Sat Feb 28 11:20:03 PST 2009


>Number:         132206
>Category:       i386
>Synopsis:       mpt driver: system panics on boot when mirroring and 2nd drive is resyncing
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Feb 28 19:20:02 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Geoffrey Lassner
>Release:        7.1-RELEASE-p3
>Organization:
N/A
>Environment:
FreeBSD 7.1-RELEASE-p3 #0: Fri Feb 27 09:32:23 MST 2009
    root at outil:/usr/src/sys/i386/compile/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 244 (1793.96-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0xf58  Stepping = 8
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow!+,3DNow!>
real memory  = 2146893824 (2047 MB)
avail memory = 2091339776 (1994 MB)

(Power is down for the weekend, so I cannot obtain "uname -a" output at this time. Sorry.)
>Description:


The system will not finish booting. The dump device is configured, but the boot does not get far enough to generate a core file that I can submit.

I have added these options to the GENERIC kernel to get more debugging information:

> options               KDB
> options               DDB
> options               INVARIANTS
> options               INVARIANT_SUPPORT
> options               WITNESS
> options               DEBUG_LOCKS
> options               DEBUG_VFS_LOCKS
> options               DIAGNOSTIC


==============================================================================
Here is the boot with debugging turned on.
==============================================================================

/boot/kernel/acpi.ko text=0x53d58 data=0x23a0+0x186c syms=[0x4+0x8690+0x4+0xb12b]
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-RELEASE-p3 #1: Fri Feb 27 17:10:26 MST 2009
    root at outil:/usr/src/sys/i386/compile/GENERIC_KDB_DDB
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 244 (1793.83-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0xf58  Stepping = 8
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow!+,3DNow!>
real memory  = 2146893824 (2047 MB)
avail memory = 2090770432 (1993 MB)
ACPI APIC Table: <PTLTD          APIC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
WITNESS: spin lock cpuset not in order list
WITNESS: spin lock intrcnt not in order list
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-23 on motherboard
ioapic1 <Version 1.1> irqs 24-27 on motherboard
ioapic2 <Version 1.1> irqs 28-31 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <PTLTD    XSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
unknown: I/O range not supported
unknown: I/O range not supported
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci1: <ACPI PCI bus> on pcib1
ohci0: <OHCI (generic) USB controller> mem 0xfd120000-0xfd120fff irq 19 at device 0.0 on pci1
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: <OHCI (generic) USB controller> mem 0xfd121000-0xfd121fff irq 19 at device 0.1 on pci1
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: <OHCI (generic) USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 3 ports with 3 removable, self powered
vgapci0: <VGA-compatible display> mem 0xfe000000-0xfe7fffff,0xfd100000-0xfd11ffff,0xfd800000-0xfdffffff irq 16 at device 5.0 on pci1
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD 8111 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1000-0x100f at device 7.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <bridge> at device 7.3 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> at device 10.0 on pci0
pci2: <ACPI PCI bus> on pcib2
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x1002> mem 0xfe800000-0xfe80ffff irq 25 at device 2.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5703 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:09:3d:10:af:b9
bge0: [ITHREAD]
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x1002> mem 0xfe810000-0xfe81ffff irq 26 at device 3.0 on pci2
miibus1: <MII bus> on bge1
brgphy1: <BCM5703 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1: Ethernet address: 00:09:3d:10:af:ba
bge1: [ITHREAD]
mpt0: <LSILogic 1030 Ultra4 Adapter> port 0x2000-0x20ff mem 0xfe830000-0xfe83ffff,0xfe820000-0xfe82ffff irq 27 at device 4.0 on pci2
mpt0: [ITHREAD]
mpt0: MPI Version=1.2.15.0
mpt0: Capabilities: ( RAID-1E RAID-1 SAFTE )
mpt0: 1 Active Volume (1 Max)
mpt0: 2 Hidden Drive Members (6 Max)
pcib3: <ACPI PCI-PCI bridge> at device 11.0 on pci0
pci3: <ACPI PCI bus> on pcib3
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio0: [FILTER]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff,0xcb000-0xcefff pnpid ORM0000 on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
mpt0:vol0(mpt0:0:0): Settings ( Hot-Plug-Spares )
mpt0:vol0(mpt0:0:0): Using Spare Pool: 0
mpt0:vol0(mpt0:0:0): 2 Members:
      (mpt0:1:0:0): Primary Online
      (mpt0:1:1:0): Secondary Out of Sync Online
mpt0:vol0(mpt0:0:0): RAID-1 - Degraded
mpt0:vol0(mpt0:0:0): Status ( Enabled Re-Syncing )
mpt0:vol0(mpt0:0:0): Low Priority Re-Sync
mpt0:vol0(mpt0:0:0): 142745967 of 143110144 blocks remaining
(mpt0:vol0:0): Physical (mpt0:0:0:0), Pass-thru (mpt0:1:0:0)
(mpt0:vol0:0): Online
(mpt0:vol0:1): Physical (mpt0:0:1:0), Pass-thru (mpt0:1:1:0)
(mpt0:vol0:1): Online
(mpt0:vol0:1): Status ( Out-Of-Sync )
acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33
Waiting 5 seconds for SCSI devices to settle
mpt0:vol0(mpt0:0:0): Volume Status Changed
mpt0:vol0(mpt0:0:0): RAID-1 - Degraded
mpt0:vol0(mpt0:0:0): Status ( Enabled Re-Syncing )
mpt0:vol0(mpt0:0:0): Low Priority Re-Sync
mpt0:vol0(mpt0:0:0): 142743855 of 143110144 blocks remaining
mpt0:vol0(mpt0:0:0): RAID-1 - Degraded
mpt0:vol0(mpt0:0:0): Status ( Enabled Re-Syncing )
mpt0:vol0(mpt0:0:0): Low Priority Re-Sync
mpt0:vol0(mpt0:0:0): 142743855 of 143110144 blocks remaining
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex mpt r = 0 (0xc528c004) locked @ cam/cam_xpt.c:7214
KDB: stack backtrace:
db_trace_self_wrapper(c0b3bd3e,e57c3ab8,c07cbd27,c0b3c0fd,e57c3acc,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c0b3c0fd,e57c3acc,4,1,0,...) at kdb_backtrace+0x29
witness_warn(5,0,c0b68cf0,c0b5b67e,c5226000,...) at witness_warn+0x1d7
trap(e57c3b58) at trap+0x122
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc046ccbb, esp = 0xe57c3b98, ebp = 0xe57c3bb0 ---
xpt_done(c52e1c00,c0b76a60,5,5,0,...) at xpt_done+0x1b
xpt_scan_bus(c51d9380,c52e4800,c0ae9ab0,c0bef344,c527d014,...) at xpt_scan_bus+0x3a9
camisr_runqueue(c528c004,0,c0ae9aa7,1c2e,0,...) at camisr_runqueue+0x39f
camisr(0,0,c0b355cb,4b6,c51d9368,...) at camisr+0x11a
ithread_loop(c51d0c50,e57c3d38,c0b3533d,31c,c5226000,...) at ithread_loop+0x1c5
fork_exit(c0770d80,c51d0c50,e57c3d38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe57c3d70, ebp = 0 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xdeadc0f2
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc046ccbb
stack pointer           = 0x28:0xe57c3b98
frame pointer           = 0x28:0xe57c3bb0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 19 (swi2: cambio)
[thread pid 19 tid 100018 ]
Stopped at      xpt_done+0x1b:  movl    0x14(%edx),%ebx
db> 
db> show registers
cs                0x20
ds                0x28
es                0x28
fs                 0x8
ss                0x28
eax         0xc52b3ab0
ecx                  0
edx         0xdeadc0de
ebx         0xc52e1c00
esp         0xe57c3b98
ebp         0xe57c3bb0
esi         0xc52e1c00
edi         0xc52e4800
eip         0xc046ccbb  xpt_done+0x1b
efl            0x90202
xpt_done+0x1b:  movl    0x14(%edx),%ebx
db> trace
Tracing pid 19 tid 100018 td 0xc52338c0
xpt_done(c52e1c00,c0b76a60,5,5,0,...) at xpt_done+0x1b
xpt_scan_bus(c51d9380,c52e4800,c0ae9ab0,c0bef344,c527d014,...) at xpt_scan_bus+0x3a9
camisr_runqueue(c528c004,0,c0ae9aa7,1c2e,0,...) at camisr_runqueue+0x39f
camisr(0,0,c0b355cb,4b6,c51d9368,...) at camisr+0x11a
ithread_loop(c51d0c50,e57c3d38,c0b3533d,31c,c5226000,...) at ithread_loop+0x1c5
fork_exit(c0770d80,c51d0c50,e57c3d38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe57c3d70, ebp = 0 ---
db> 

==============================================================================
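
A note on reading the dump above, offered as a sketch rather than a diagnosis. The faulting instruction is "movl 0x14(%edx),%ebx", so the fault address should equal edx plus 0x14. With the debug kernel, edx is 0xdeadc0de, which is, as far as I know, the pattern that kernels built with INVARIANTS write over freed memory; that would point at xpt_done reading through a pointer taken from an already freed structure. The GENERIC panic quoted in the How-To-Repeat section faults at 0x14, which would fit the same load through a NULL pointer, assuming that fault is at the same instruction (the report does not confirm this). The arithmetic can be checked as follows; all constants are copied from this report.

```python
# Values copied from the console output in this report.
FAULT_ADDR_DEBUG = 0xdeadc0f2    # "fault virtual address", debug kernel
EDX_DEBUG = 0xdeadc0de           # edx from "show registers"
FAULT_ADDR_GENERIC = 0x14        # "fault virtual address", GENERIC kernel
DISPLACEMENT = 0x14              # from "movl 0x14(%edx),%ebx"


def base_pointer(fault_addr, displacement):
    """Return the base register value implied by a faulting load."""
    return fault_addr - displacement


debug_base = base_pointer(FAULT_ADDR_DEBUG, DISPLACEMENT)
# Assumption: the GENERIC fault is at the same load instruction.
generic_base = base_pointer(FAULT_ADDR_GENERIC, DISPLACEMENT)

print("debug kernel base pointer:  ", hex(debug_base))
print("GENERIC kernel base pointer:", hex(generic_base))
print("matches edx in register dump:", debug_base == EDX_DEBUG)
```

If both readings hold, the two panics are the same stale pointer seen with and without memory trashing enabled.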

If you need more information, please email me. I have not used gdb/kdb before, though,
so I will need explicit commands or instructions if you want information from either of them.

Thanks.
>How-To-Repeat:

Set up mirroring on a Sun Fire V20Z through the LSI utility with the onboard
mpt controller.

mpt0: <LSILogic 1030 Ultra4 Adapter> port 0x2000-0x20ff mem 0xfe830000-0xfe83ffff,0xfe820000-0xfe82ffff irq 27 at device 4.0 on pci2
mpt0: [ITHREAD]
mpt0: MPI Version=1.2.15.0
mpt0: Capabilities: ( RAID-1E RAID-1 SAFTE )
mpt0: 1 Active Volume (1 Max)
mpt0: 2 Hidden Drive Members (6 Max)


While the second drive is resyncing, try to boot the system with the GENERIC kernel. Mine panics with a message similar to the one below:

Waiting 5 seconds for SCSI devices to settle
mpt0: Volume(0:0): Volume Status Changed
mpt0: Volume(0:0): Volume Status Changed
mpt0:vol0(mpt0:0:0): Settings ( Hot-Plug-Spares )
mpt0:vol0(mpt0:0:0): Using Spare Pool: 0
mpt0:vol0(mpt0:0:0): 2 Members:
      (mpt0:1:0:0): Primary Online
      (mpt0:1:1:0): Secondary Out of Sync Online
mpt0:vol0(mpt0:0:0): RAID-1 - Degraded
mpt0:vol0(mpt0:0:0): Status ( Enabled Re-Syncing )
mpt0:vol0(mpt0:0:0): Low Priority Re-Sync
mpt0:vol0(mpt0:0:0): 142881270 of 143110144 blocks remaining
(mpt0:vol0:0): Physical (mpt0:0:0:0), Pass-thru (mpt0:1:0:0)
(mpt0:vol0:0): Online
(mpt0:vol0:1): Physical (mpt0:0:1:0), Pass-thru (mpt0:1:1:0)
(mpt0:vol0:1): Online
(mpt0:vol0:1): Status ( Out-Of-Sync )
mpt0:vol0(mpt0:0:0): RAID-1 - Degraded
mpt0:vol0(mpt0:0:0): Status ( Enabled Re-Syncing )
mpt0:vol0(mpt0:0:0): Low Priority Re-Sync
mpt0:vol0(mpt0:0:0): 142881138 of 143110144 blocks remaining


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x14
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc046ae4b
stack pointer           = 0x28:0xe5729b80
frame pointer           = 0x28:0xe5729b9c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 17 (swi2: cambio)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 6s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort


>Fix:

Unknown.

The workaround is to leave the system down while the disk is resyncing.
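
To judge how far a resync has progressed before attempting a boot, the "blocks remaining" lines printed by the mpt driver can be turned into a completion figure. The following is a minimal sketch; the helper names are hypothetical, and the sample lines are copied from the boot output in this report.

```python
import re

# Matches lines such as:
#   mpt0:vol0(mpt0:0:0): 142745967 of 143110144 blocks remaining
REMAINING_RE = re.compile(r"(\d+) of (\d+) blocks remaining")


def parse_remaining(line):
    """Return (remaining, total) from a resync status line, or None."""
    match = REMAINING_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def blocks_done(total, remaining):
    """Number of blocks already resynced."""
    return total - remaining


def percent_done(total, remaining):
    """Resync progress as a percentage of the whole volume."""
    return 100.0 * blocks_done(total, remaining) / total


samples = [
    "mpt0:vol0(mpt0:0:0): 142745967 of 143110144 blocks remaining",
    "mpt0:vol0(mpt0:0:0): 142743855 of 143110144 blocks remaining",
]

for line in samples:
    remaining, total = parse_remaining(line)
    print(f"{blocks_done(total, remaining)} blocks done, "
          f"{percent_done(total, remaining):.3f}% complete")
```

In both captured boots the resync had completed well under one percent of the volume when the panic occurred.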

>Release-Note:
>Audit-Trail:
>Unformatted:

