graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)

Dominic Marks dom at goodforbusiness.co.uk
Wed Jun 29 18:43:17 GMT 2005


On Wednesday 29 June 2005 16:42, Dominic Marks wrote:
> Hello,
>
> I'm trying to use graid3 to create a RAID volume from three
> 250GB SATA discs. I can successfully label, format, and mount
> the disc. The problem arises when I try to migrate some data
> on to the new volume. I'm using rsync to do this over the
> local network; unfortunately this seems to produce an
> immediate and reproducible panic (hand copied):
>
> Fatal trap 12: page fault while in kernel mode
>
> fault virtual address = 0xc30f8000
> fault code = supervisor write, page not present
> instruction pointer = 0x8:0xc05e9783
> stack pointer = 0x10:0xd8030c38
> frame pointer = 0x10:0xd8030c80
> code segment = base 0x0, limit 0xfffff type 0x1b
>              = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 617 (g_raid3 raid)
> trap number = 12
> panic: page fault
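
One detail in the trap output is worth a quick check. Assuming the usual 4 KB page size on i386 (an assumption; the message does not state it), the snippet below splits the reported fault address into a page base and an offset within the page. The variable names are illustrative only.

```python
# Fault address copied from the trap output above.
FAULT_ADDR = 0xC30F8000

# Assumption: standard 4 KB pages on i386.
PAGE_SIZE = 4096

page_base = FAULT_ADDR & ~(PAGE_SIZE - 1)    # start of the faulting page
page_offset = FAULT_ADDR & (PAGE_SIZE - 1)   # byte offset inside that page

print(f"page base   = {page_base:#x}")
print(f"page offset = {page_offset:#x}")
print("fault is on the first byte of a page:", page_offset == 0)
```

The offset is zero, so the faulting write landed on the very first byte of an unmapped page. That is consistent with, though by no means proof of, a copy running off the end of a mapped buffer into the next page.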

Having recompiled, I can no longer produce the panic. I think
I may have caused it myself: I had forgotten that I had been
tinkering with some values in sys/sys/param.h last week, and
it didn't ring a bell when the system went down. I'd been
running with MAXPHYS and DFLTPHYS at 256 and it seems graid3
does not like one of those parameters being raised. I suspect
it's DFLTPHYS, and that perhaps graid3 depends on its value
for some calculations. This is pure speculation.
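
To put rough numbers on that change, here is a small sketch. It assumes the stock definitions in sys/sys/param.h are DFLTPHYS = 64 * 1024 and MAXPHYS = 128 * 1024 (not quoted in this message), and that "256" above means 256 KB for both constants. The helper name growth() is made up for illustration.

```python
KIB = 1024

# Assumed stock definitions from sys/sys/param.h (not quoted in the message).
STOCK_DFLTPHYS = 64 * KIB
STOCK_MAXPHYS = 128 * KIB

# Assumption: "256" in the message means 256 KB for both constants.
RAISED_PHYS = 256 * KIB


def growth(stock: int, raised: int) -> tuple[float, int]:
    """Return (scale factor, extra bytes) of the raised limit over stock."""
    return raised / stock, raised - stock


for name, stock in (("DFLTPHYS", STOCK_DFLTPHYS), ("MAXPHYS", STOCK_MAXPHYS)):
    factor, extra = growth(stock, RAISED_PHYS)
    print(f"{name}: {stock} -> {RAISED_PHYS} bytes "
          f"({factor:g}x, {extra // KIB} KB more per transfer)")
```

Under those assumptions DFLTPHYS grows by a larger factor than MAXPHYS. Any code that sizes a buffer from one constant while accepting transfers bounded by the other would see that mismatch; whether graid3 does so is not established here.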

My apologies for the incorrect report.

> Other programs (touch, ls, diskinfo, etc) do not seem to provoke
> the panic, but rsync will kill the system within a second.
>
> I got a dump (once), but I think it is corrupt in some way
> because I have not been able to get a backtrace or any other
> useful data from it.
>
> # kgdb kernel.debug /usr/crash/vmcore.0
> kgdb: kvm_read: invalid address (f9)
>
> (This line is printed again, and again, and again ...)
>
> This may be because I compiled my debugging kernel after I had
> installed the system, although it should have been an identical
> source tree ... I'm currently rebuilding the system to
> the freshest available -STABLE in the hope that may give a
> full backtrace.
>
> FreeBSD mrt.helenmarks.co.uk 5.4-STABLE FreeBSD 5.4-STABLE #0
> Mon Jun 27 09:34:02 BST 2005
> root at mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV i386
>
> The only thing slightly odd about the machine is that each
> disc is on its own SATA controller. One disc is attached to an
> Intel ICH6; the other two are attached to two Silicon Image (3112)
> based cards. The root device is ad2, since the additional cards
> have pushed themselves to the front. This is a temporary setup
> to facilitate migration of data from system to system.
>
> If I can do anything to help track the problem down, please say.
> I really want this to work, and I have some time in which to run
> tests.
>
> * A side note: I have noticed that the panic is often accompanied by
> an ATA DMA timeout (ad1). Could this cause the panic to occur?
>
> Copyright (c) 1992-2005 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> 1994 The Regents of the University of California. All rights
> reserved.
> FreeBSD 5.4-STABLE #0: Mon Jun 27 09:34:02 BST 2005
>     root at mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV
> WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
> WARNING: MPSAFE network stack disabled, expect reduced performance.
> ACPI APIC Table: <DELL   PESC420>
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Celeron(R) CPU 2.53GHz (2527.01-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
>
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,
> PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> real memory  = 526958592 (502 MB)
> avail memory = 509628416 (486 MB)
> ioapic0: Changing APIC ID to 8
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> lapic0: Forcing LINT1 to edge trigger
> npx0: <math processor> on motherboard
> npx0: INT 16 interface
> acpi0: <DELL PESC420> on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0: <ACPI CPU> on acpi0
> acpi_button0: <Power Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pci0: <display, VGA> at device 2.0 (no driver attached)
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> bge0: <Broadcom BCM5751 Gigabit Ethernet, ASIC rev. 0x4001> mem
> 0xdfdf0000-0xdfdfffff irq 16 at device 0.0 on pci2
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:11:11:c3:2c:91
> pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pci0: <serial bus, USB> at device 29.0 (no driver attached)
> pci0: <serial bus, USB> at device 29.1 (no driver attached)
> pci0: <serial bus, USB> at device 29.2 (no driver attached)
> pci0: <serial bus, USB> at device 29.3 (no driver attached)
> pci0: <serial bus, USB> at device 29.7 (no driver attached)
> pcib4: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci4: <ACPI PCI bus> on pcib4
> atapci0: <SiI 3112 SATA150 controller> port
> 0xdce0-0xdcef,0xdcb4-0xdcb7,0xdcc8-0xdccf,0xdcb0-0xdcb3,0xdcc0-0xdcc7
> mem 0xdfaffc00-0xdfaffdff irq 17 at device 1.0 on pci4
> ata2: channel #0 on atapci0
> ata3: channel #1 on atapci0
> atapci1: <SiI 3112 SATA150 controller> port
> 0xdcf0-0xdcff,0xdcbc-0xdcbf,0xdcd8-0xdcdf,0xdcb8-0xdcbb,0xdcd0-0xdcd7
> mem 0xdfaffe00-0xdfafffff irq 18 at device 2.0 on pci4
> ata4: channel #0 on atapci1
> ata5: channel #1 on atapci1
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci2: <Intel ICH6 UDMA100 controller> port
> 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 16 at device
> 31.1 on pci0
> ata0: channel #0 on atapci2
> ata1: channel #1 on atapci2
> atapci3: <Intel ICH6 SATA150 controller> port
> 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07
> irq 20 at device 31.2 on pci0
> ata6: channel #0 on atapci3
> ata7: channel #1 on atapci3
> ichsmb0: <SMBus controller> port 0xece0-0xecff irq 17 at device 31.3
> on pci0
> atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10
> on acpi0
> sio0: type 16550A
> pmtimer0 on isa0
> orm0: <ISA Option ROMs> at iomem
> 0xcf800-0xcffff,0xce000-0xcf7ff,0xc9800-0xcdfff,0xc0000-0xc97ff on
> isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
> isa0
> ppc0: parallel port not found.
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> Timecounter "TSC" frequency 2527010839 Hz quality 800
> Timecounters tick every 1.250 msec
> IPsec: Initialized Security Association Processing.
> ad0: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at
> ata4-master SATA150
> ad1: 238475MB <WDC WD2500JD-22HBB0/08.02D08> [484521/16/63] at
> ata5-master SATA150
> ad2: 76319MB <WDC WD800JD-60JRA0/05.01C05> [155061/16/63] at
> ata6-master SATA150
> ad3: 238475MB <WDC WD2500JD-00HBB0/08.02D08> [484521/16/63] at
> ata7-master SATA150
> Mounting root from ufs:/dev/ad2s1a
> WARNING: / was not properly dismounted
> WARNING: /usr was not properly dismounted
> WARNING: /var was not properly dismounted
>
> Thanks very much,

-- 
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.

