ATA DMA Issues Resurfaced (READ_DMA TIMEOUT/FAILURE)
Kendall Gifford
zettabyte at gmail.com
Thu Apr 21 11:30:35 PDT 2005
Howdy. I'm not sure whether hardware or stable is the best list for
this, but here is my problem. Any info, recommendations, or help will
be greatly appreciated.
I've got a server running 5-STABLE (updated/built Jan. 22, 2005). It
has been running this kernel, a 5.3-RELEASE kernel, and other 5.x
branch versions for the last ten or so months now. Previous to this,
it was running 4.9-RELEASE.
About ten months ago, when I switched from the 4.x branch to the 5.x
branch, I immediately began experiencing WRITE_DMA ICRC errors during
disk activity at seemingly random times. At that time I posted the
following message to this list and to questions:
http://groups-beta.google.com/group/mailing.freebsd.questions/browse_thread/thread/17fe5871d823f380/a16568320427152e?rnum=2#a16568320427152e
The gist of the message and my current experience is that my hardware
(drives, cables, motherboard controllers, etc.) is definitely fine and
that I've noticed others posting various, possibly-related issues both
before and since I posted the above message. I basically ended up
working around the problem by running atacontrol in a
/usr/local/etc/rc.d/ script that set my drives to PIO4 mode. I then
mostly forgot about the problem as everything has since worked
fine--that is until just recently.
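
For illustration, a minimal sketch of what such a workaround script
might look like. This is not the original script. The channel list, the
"pio4 pio4" arguments, and the 5.x-style invocation
"atacontrol mode <channel> <master> <slave>" are assumptions, as is the
ATACONTROL override, which exists only so the script can be dry-run.

```shell
#!/bin/sh
# Hypothetical /usr/local/etc/rc.d/ script: force ATA channels to PIO4.
# Assumed, not taken from the original post: the channel list and the
# "atacontrol mode <channel> <master> <slave>" argument form.

ATACONTROL="${ATACONTROL:-/sbin/atacontrol}"   # set to "echo" for a dry run
CHANNELS="${CHANNELS:-0 1}"                    # ata0 and ata1

set_pio4() {
    for ch in ${CHANNELS}; do
        # Request PIO4 for both master and slave on this channel;
        # stop at the first channel where the command fails.
        ${ATACONTROL} mode "${ch}" pio4 pio4 || return 1
    done
}

# Old-style local rc.d scripts are called with "start" at boot.
case "${1:-}" in
start)
    set_pio4
    ;;
esac
```

Running it as "ATACONTROL=echo sh script.sh start" only prints the
arguments that would be passed, without touching the drives. The
per-device modes would need adjusting to match the real setup.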
About a week ago (around April 14, 2005) after performing some updates
of some ports and configurations, I decided to perform a reboot (quite
extraneous, I know, but reassuring to verify that all scripts/configs
are properly set up the way I want). Just as my system began starting
local services, and just after it ran my custom /usr/local/etc/rc.d
atacontrol script, I got the following error messages:
<Screenshot>
Master = PIO4
Slave = UDMA33
Master = PIO4
Slave = BIOSPIO
ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=146793208
ad0: FAILURE - READ_DMA timed out
GEOM_VINUM: subdisk raid.p0.s0 is down
GEOM_VINUM: plex raid.p0 is down
Starting mysql.
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc04ba88f
stack pointer = 0x10:0xd321dc6c
frame pointer = 0x10:0xd321dc98
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL0, pres 1, def32 1, gran1
processor eflags = interrupt enabled, resume, IOPL= 0
current process = 4 (g_down)
trap number = 12
panic: page fault
Uptime: 28s
</End Screenshot>
This is the first time in ten months I've had issues switching to PIO4
mode during local service startup. I really am not quite sure what
happened.
Anyhow, I've since rebooted into single-user mode, brought my
gvinum-mirror plex back up, and done the usual stuff to manually bring
my system up. But, I did have one attempt at doing this when I foolishly
forgot to manually atacontrol my drives before trying to bring my
gvinum plex back up. As it was restoring in the background, I
remembered and unthinkingly ran atacontrol and again succeeded in
bringing my system down in much the same manner as shown above (only
this time with WRITE_DMA errors instead of READ_DMA errors).
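
The ordering described above (switch to PIO4 first, resync second) can
be sketched as follows. The exact commands are assumptions: the post
does not show them, so the "atacontrol mode" arguments and the use of
"gvinum start" on the plex are illustrative only. By default the sketch
only prints the commands.

```shell
#!/bin/sh
# Hypothetical recovery order for single-user mode.
# Assumed: channel numbers, "pio4 pio4" arguments, and "gvinum start"
# as the way the stale plex is brought back up.

RUN="${RUN-echo}"   # default prints the commands; run with RUN= to execute

recover() {
    plex="$1"
    # 1. Put both ATA channels into PIO4 before any heavy disk I/O.
    ${RUN} atacontrol mode 0 pio4 pio4 || return 1
    ${RUN} atacontrol mode 1 pio4 pio4 || return 1
    # 2. Only then start resynchronizing the stale plex.
    ${RUN} gvinum start "${plex}"
}

recover raid.p0
```

The point of the sketch is the sequencing: the mode change happens
while the disks are idle, and the plex sync is started only after both
mode changes have succeeded.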
Anyhow, based on this experience, my two guesses as to the cause of my
booting problem are that disk activity from starting the system is
causing problems before my disks can be put fully in PIO4 mode (and
timing is immaculate) or that the current state of things when
atacontrol is executed causes problems. As you can see, I have no idea
what the real problem is and wonder if any more info on this/these
ata/dma problems is available. I wonder if I'd be better off moving to
4.11 until the root cause of these problems is found.
Any help or information anyone?
System Info:
<Kernel Config>
machine i386
cpu I686_CPU
device npx
device isa
device pci
device agp
options VESA
ident KERNEL
maxusers 100
options SCHED_4BSD
options COMPAT_43
options COMPAT_FREEBSD4
options SYSVSHM
options SYSVSEM
options SYSVMSG
options KTRACE
options INVARIANT_SUPPORT
options INET
device ether
device loop
device bpf
device tun
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=1000
options IPDIVERT
options FFS
options NFSCLIENT
options NFSSERVER
options CD9660
options FDESCFS
options MSDOSFS
options NTFS
options NULLFS
options PROCFS
options PSEUDOFS
options UDF
options SOFTUPDATES
options UFS_EXTATTR
options UFS_EXTATTR_AUTOSTART
options UFS_ACL
options GEOM_BSD
options GEOM_CONCAT
options GEOM_GPT
options GEOM_LABEL
options GEOM_MBR
options GEOM_MIRROR
options GEOM_VOL
options QUOTA
device md
device random
device pty
device snp
options _KPOSIX_PRIORITY_SCHEDULING
device atkbdc
device atkbd
device psm
device vga
device splash
device sc
options MAXCONS=16
options SC_HISTORY_SIZE=2000
options SC_TWOBUTTON_MOUSE
options SC_KERNEL_CONS_ATTR=(FG_RED|BG_BLACK)
options SC_KERNEL_CONS_REV_ATTR=(FG_BLACK|BG_RED)
device ata
device atadisk
device ataraid
device atapicd
device atapifd
device atapist
options ATA_STATIC_ID
device fdc
device sio
device ppc
device ppbus
device lpt
device ppi
device pmtimer
device mem
device apic
device io
device miibus
device vr
device uhci
device ohci
device usb
device ucom
device ugen
device uhid
device ukbd
device ulpt
device ums
device uscanner
</End Kernel Config>
<Device Hints>
hint.atkbdc.0.at="isa"
hint.atkbdc.0.port="0x060"
hint.atkbd.0.at="atkbdc"
hint.atkbd.0.irq="1"
hint.atkbd.0.flags="0x1"
hint.psm.0.at="atkbdc"
hint.psm.0.irq="12"
hint.vga.0.at="isa"
hint.sc.0.at="isa"
hint.sc.0.flags="0x100"
hint.fdc.0.at="isa"
hint.fdc.0.port="0x3f0"
hint.fdc.0.irq="6"
hint.fdc.0.drq="2"
hint.fd.0.at="fdc0"
hint.fd.0.drive="0"
hint.fd.1.at="fdc0"
hint.fd.1.drive="1"
hint.sio.0.at="isa"
hint.sio.0.port="0x3f8"
hint.sio.0.flags="0x10"
hint.sio.0.irq="4"
hint.sio.1.at="isa"
hint.sio.1.port="0x2f8"
hint.sio.1.irq="3"
hint.ppc.0.at="isa"
hint.ppc.0.irq="7"
</End Device Hints>
<Dmesg>
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Sat Jan 22 19:54:10 MST 2005
root at name.domain.tld:/usr/obj/usr/src/sys/KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Duron(tm) processor (1297.79-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x671 Stepping = 1
Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
AMD Features=0xc0400000<AMIE,DSP,3DNow!>
real memory = 536870912 (512 MB)
avail memory = 519913472 (495 MB)
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> pcibus 0 on motherboard
pir0: <PCI Interrupt Routing Table: 9 Entries> on motherboard
pci0: <PCI bus> on pcib0
agp0: <VIA Generic host to PCI bridge> mem 0xe0000000-0xe7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci0: <display, VGA> at device 8.0 (no driver attached)
uhci0: <VIA 83C572 USB controller> port 0xd000-0xd01f irq 11 at device 16.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: <VIA 83C572 USB controller> on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xd400-0xd41f irq 3 at device 16.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: <VIA 83C572 USB controller> on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: <VIA 83C572 USB controller> port 0xd800-0xd81f irq 10 at device 16.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: <VIA 83C572 USB controller> on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: <serial bus, USB> at device 16.3 (no driver attached)
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port 0xdc00-0xdc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 17.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <multimedia, audio> at device 17.5 (no driver attached)
vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe800-0xe8ff mem 0xed001000-0xed0010ff irq 11 at device 18.0 on pci0
miibus0: <MII bus> on vr0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:0d:87:00:bf:1d
cpu0 on motherboard
orm0: <ISA Option ROMs> at iomem 0xc8000-0xcffff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: <Parallel port bus> on ppc0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
unknown: <PNP0303> can't assign resources (port)
unknown: <PNP0c02> can't assign resources (memory)
unknown: <PNP0f13> can't assign resources (irq)
unknown: <PNP0501> can't assign resources (port)
unknown: <PNP0700> can't assign resources (port)
unknown: <PNP0401> can't assign resources (port)
Timecounter "TSC" frequency 1297789521 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 1000 packets/entry by default
ad0: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata0-master UDMA133
acd0: CDRW <LITE-ON LTR-48246S/SS08> at ata0-slave UDMA33
ad2: 117246MB <Maxtor 6Y120P0/YAR41VW0> [238216/16/63] at ata1-master UDMA133
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
GEOM_VINUM: subdisk raid.p1.s0 is up
GEOM_VINUM: subdisk raid.p0.s0 is stale
GEOM_VINUM: plex sync raid.p1 -> raid.p0 started
GEOM_VINUM: sd raid.p0.s0 is initializing
GEOM_VINUM: plex raid.p0 is degraded
GEOM_VINUM: plex raid.p0 is up
GEOM_VINUM: plex sync raid.p1 -> raid.p0 finished
</End Dmesg>
--
Kendall Gifford