kern/119140: Kernel panic with sata drive and dma problem

Michael Haro mharo at FreeBSD.org
Sat Dec 29 11:30:02 PST 2007


>Number:         119140
>Category:       kern
>Synopsis:       Kernel panic with sata drive and dma problem
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Dec 29 19:30:01 UTC 2007
>Closed-Date:
>Last-Modified:
>Originator:     Michael Haro
>Release:        FreeBSD 7.0-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD zfsserver.mtv.bitsurf.net 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #4: Sun Dec 23 16:46:49 PST 2007 root at zfsserver.mtv.bitsurf.net:/usr/obj/usr/src/sys/KERNEL i386

dmesg:
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-PRERELEASE #4: Sun Dec 23 16:46:49 PST 2007
    root at zfsserver.mtv.bitsurf.net:/usr/obj/usr/src/sys/KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 2000+ (1659.61-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x680  Stepping = 0
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400800<SYSCALL,MMX+,3DNow!+,3DNow!>
real memory  = 1073676288 (1023 MB)
avail memory = 1036881920 (988 MB)
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Dec 23 2007 16:46:09)
acpi0: <AMIINT SiS735XX> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <SiS 735 host to AGP bridge> on hostb0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
isab0: <PCI-ISA bridge> at device 2.0 on pci0
isa0: <ISA bus> on isab0
ohci0: <SiS 5571 USB controller> mem 0xcfbdd000-0xcfbddfff irq 12 at device 2.2 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: <SiS 5571 USB controller> on ohci0
usb0: USB revision 1.0
uhub0: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 3 ports with 3 removable, self powered
ohci1: <SiS 5571 USB controller> mem 0xcfbde000-0xcfbdefff irq 10 at device 2.3 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: <SiS 5571 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: <SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 3 ports with 3 removable, self powered
atapci0: <SiS 735 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 2.5 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <multimedia, audio> at device 2.7 (no driver attached)
sis0: <SiS 900 10/100BaseTX> port 0xc800-0xc8ff mem 0xcfbdc000-0xcfbdcfff irq 10 at device 3.0 on pci0
miibus0: <MII bus> on sis0
rlphy0: <RTL8201L 10/100 media interface> PHY 1 on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sis0: Ethernet address: 00:0a:e6:8a:3b:e3
sis0: [ITHREAD]
atapci1: <Promise PDC20375 SATA150 controller> port 0xdc00-0xdc3f,0xd800-0xd80f,0xd400-0xd47f mem 0xcfbdf000-0xcfbdffff,0xcfba0000-0xcfbbffff irq 5 at device 13.0 on pci0
atapci1: [ITHREAD]
atapci1: [ITHREAD]
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
atapci2: <Promise PDC40518 SATA150 controller> port 0xc400-0xc47f,0xc000-0xc0ff mem 0xcfbdb000-0xcfbdbfff,0xcfb60000-0xcfb7ffff irq 12 at device 15.0 on pci0
atapci2: [ITHREAD]
atapci2: [ITHREAD]
ata5: <ATA channel 0> on atapci2
ata5: [ITHREAD]
ata6: <ATA channel 1> on atapci2
ata6: [ITHREAD]
ata7: <ATA channel 2> on atapci2
ata7: [ITHREAD]
ata8: <ATA channel 3> on atapci2
ata8: [ITHREAD]
vgapci0: <VGA-compatible display> mem 0xcfc00000-0xcfffffff,0xcfbf0000-0xcfbfffff,0xcf400000-0xcf7fffff irq 11 at device 17.0 on pci0
uhci0: <VIA 83C572 USB controller> port 0xb800-0xb81f irq 11 at device 19.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb2: <VIA 83C572 USB controller> on uhci0
usb2: USB revision 1.0
uhub2: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
uhci1: <VIA 83C572 USB controller> port 0xbc00-0xbc1f irq 11 at device 19.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb3: <VIA 83C572 USB controller> on uhci1
usb3: USB revision 1.0
uhub3: <VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: <VIA VT6202 USB 2.0 controller> mem 0xcfbdaf00-0xcfbdafff irq 5 at device 19.2 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 0.95
usb4: companion controllers, 2 ports each: usb2 usb3
usb4: <VIA VT6202 USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: <VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
uhub4: 4 ports with 4 removable, self powered
acpi_button0: <Sleep Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
fdc0: <floppy drive controller> port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xcb800-0xd07ff,0xd0800-0xd87ff pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
WARNING: ZFS is considered to be an experimental feature in FreeBSD.
Timecounter "TSC" frequency 1659614133 Hz quality 800
Timecounters tick every 1.000 msec
hptrr: no controller detected.
ZFS filesystem version 6
ZFS storage pool version 6
ad0: 114473MB <WDC WD1200BB-00CJA1 17.07W17> at ata0-master UDMA100
ad4: 715404MB <Seagate ST3750640AS 3.AAD> at ata2-master SATA150
ad8: 239372MB <Maxtor 6Y250P0 YAR41BW0> at ata4-master UDMA133
ad10: 238475MB <Hitachi HDS722525VLSA80 V36OA6MA> at ata5-master SATA150
ad12: 238475MB <Hitachi HDS722525VLSA80 V36OA6MA> at ata6-master SATA150
ad14: 238475MB <Hitachi HDS722525VLSA80 V36OA6MA> at ata7-master SATA150
ad16: 238475MB <Hitachi HDS722525VLSA80 V36OA6MA> at ata8-master SATA150
Trying to mount root from zfs:tank

zfsserver# pciconf -vl
hostb0 at pci0:0:0:0:      class=0x060000 card=0x00000000 chip=0x07351039 rev=0x01 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS 735 Host-to-PCI Bridge'
    class      = bridge
    subclass   = HOST-PCI
pcib1 at pci0:0:1:0:       class=0x060400 card=0x00000000 chip=0x00011039 rev=0x00 hdr=0x01
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS730 Virtual PCI-to-PCI bridge (AGP)'
    class      = bridge
    subclass   = PCI-PCI
isab0 at pci0:0:2:0:       class=0x060100 card=0x00000000 chip=0x00081039 rev=0x00 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS PCI to ISA Bridge (LPC Bridge)'
    class      = bridge
    subclass   = PCI-ISA
ohci0 at pci0:0:2:2:       class=0x0c0310 card=0x70011039 chip=0x70011039 rev=0x07 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS5597/8 Universal Serial Bus Controller'
    class      = serial bus
    subclass   = USB
ohci1 at pci0:0:2:3:       class=0x0c0310 card=0x70011039 chip=0x70011039 rev=0x07 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS5597/8 Universal Serial Bus Controller'
    class      = serial bus
    subclass   = USB
atapci0 at pci0:0:2:5:     class=0x010180 card=0x55131039 chip=0x55131039 rev=0xd0 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS5513 EIDE Controller (A,B step)'
    class      = mass storage
    subclass   = ATA
none0 at pci0:0:2:7:       class=0x040100 card=0x030013f6 chip=0x70121039 rev=0xa0 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS7012 PCI Audio Accelerator'
    class      = multimedia
    subclass   = audio
sis0 at pci0:0:3:0:        class=0x020000 card=0x09001039 chip=0x09001039 rev=0x90 hdr=0x00
    vendor     = 'Silicon Integrated Systems (SiS)'
    device     = 'SiS900 sis 900 and integrated lan'
    class      = network
    subclass   = ethernet
atapci1 at pci0:0:13:0:    class=0x018000 card=0x3375105a chip=0x3375105a rev=0x02 hdr=0x00
    vendor     = 'Promise Technology Inc'
    device     = 'PDC20375(??) FastTrak SATA150 TX2plus Controller'
    class      = mass storage
atapci2 at pci0:0:15:0:    class=0x018000 card=0x3d18105a chip=0x3d18105a rev=0x02 hdr=0x00
    vendor     = 'Promise Technology Inc'
    device     = 'Promise SATAII150 518 (tm) IDE Controller'
    class      = mass storage
vgapci0 at pci0:0:17:0:    class=0x030000 card=0x00000000 chip=0x96601023 rev=0xd3 hdr=0x00
    vendor     = 'Trident Microsystems'
    device     = 'TGUI9660XGi/968x/938x GUI Accelerator'
    class      = display
    subclass   = VGA
uhci0 at pci0:0:19:0:      class=0x0c0300 card=0x12340925 chip=0x30381106 rev=0x50 hdr=0x00
    vendor     = 'VIA Technologies Inc'
    device     = 'VT83C572, VT6202 VIA Rev 5 or later USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
uhci1 at pci0:0:19:1:      class=0x0c0300 card=0x12340925 chip=0x30381106 rev=0x50 hdr=0x00
    vendor     = 'VIA Technologies Inc'
    device     = 'VT83C572, VT6202 VIA Rev 5 or later USB Universal Host Controller'
    class      = serial bus
    subclass   = USB
ehci0 at pci0:0:19:2:      class=0x0c0320 card=0x12340925 chip=0x31041106 rev=0x51 hdr=0x00
    vendor     = 'VIA Technologies Inc'
    device     = 'VT6202/12 USB 2.0 Enhanced Host Controller'
    class      = serial bus
    subclass   = USB


zfsserver# atacontrol list
ATA channel 0:
    Master:  ad0 <WDC WD1200BB-00CJA1/17.07W17> ATA/ATAPI revision 5
    Slave:       no device present
ATA channel 1:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <ST3750640AS/3.AAD> Serial ATA v1.0
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:  ad8 <Maxtor 6Y250P0/YAR41BW0> ATA/ATAPI revision 7
    Slave:       no device present
ATA channel 5:
    Master: ad10 <HDS722525VLSA80/V36OA6MA> Serial ATA v1.0
    Slave:       no device present
ATA channel 6:
    Master: ad12 <HDS722525VLSA80/V36OA6MA> Serial ATA v1.0
    Slave:       no device present
ATA channel 7:
    Master: ad14 <HDS722525VLSA80/V36OA6MA> Serial ATA v1.0
    Slave:       no device present
ATA channel 8:
    Master: ad16 <HDS722525VLSA80/V36OA6MA> Serial ATA v1.0
    Slave:       no device present


zfsserver# smartctl -l error /dev/ad14

Error 4 occurred at disk power-on lifetime: 14449 hours (602 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 b2 07 c7 e0  Error: ICRC, ABRT at LBA = 0x00c707b2 = 13043634

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 40 73 f7 c6 ec 00   1d+23:59:15.200  READ DMA EXT
  c6 00 10 00 00 00 e0 00   1d+23:59:15.200  SET MULTIPLE MODE
  ef 02 00 00 00 00 e0 00   1d+23:59:15.200  SET FEATURES [Enable write cache]
  ef aa 00 00 00 00 e0 00   1d+23:59:15.200  SET FEATURES [Enable read look-ahead]
  ef 03 45 00 00 00 e0 00   1d+23:59:15.200  SET FEATURES [Set transfer mode]

Error 3 occurred at disk power-on lifetime: 14448 hours (602 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 0c bd c2 ec  Error: ICRC, ABRT at LBA = 0x0cc2bd0c = 214088972

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  35 00 80 8d bc c2 e0 00   1d+23:31:28.300  WRITE DMA EXT
  35 00 80 0d bc c2 e0 00   1d+23:31:28.300  WRITE DMA EXT
  35 00 80 8d bb c2 e0 00   1d+23:31:28.300  WRITE DMA EXT
  35 00 80 0d bb c2 e0 00   1d+23:31:28.200  WRITE DMA EXT
  35 00 80 8d ba c2 e0 00   1d+23:31:28.200  WRITE DMA EXT
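
For readers who want to check the register dumps above by hand, here is a minimal sketch of how the ER byte and the SN/CL/CH/DH bytes map to the "ICRC, ABRT at LBA = ..." summary that smartctl prints. The function names are made up for illustration, and the 28-bit LBA layout is an assumption chosen because it reproduces the two LBA values shown above.

```python
# Sketch: decode the "ER ST SC SN CL CH DH" rows from the smartctl error log.
# Helper names are hypothetical; bit meanings follow the ATA Error register.

ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error during an Ultra DMA transfer
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # requested sector ID not found
    0x04: "ABRT",  # command aborted
}


def decode_error(er):
    """Return the names of the error bits set in the ER byte, high bit first."""
    return [name for bit, name in sorted(ERROR_BITS.items(), reverse=True) if er & bit]


def lba28(sn, cl, ch, dh):
    """Assemble a 28-bit LBA: low nibble of DH, then CH, CL, SN."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def parse_row(row):
    """Parse one register row as printed by smartctl (seven hex bytes)."""
    er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in row.split())
    return decode_error(er), lba28(sn, cl, ch, dh)


error4 = parse_row("84 51 00 b2 07 c7 e0")
error3 = parse_row("84 51 00 0c bd c2 ec")
print(error4)
print(error3)
```

Both rows decode to ICRC plus ABRT, with LBAs 13043634 and 214088972, which agrees with the values smartctl reported for errors 4 and 3.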


Error 4 is the one that resulted in the panic. Error 3 is the one that resulted in the drive going away and requiring the reboot. Errors 1 and 2 are the same as error 3, and all of them happened yesterday. Yesterday I moved the computer into a different case. Prior to that, a different drive (same model) was occasionally having the same problem. This leads me to believe that it's not a hard drive issue, but as the drives are all the same model and were purchased at the same time, I can't say that for sure.

When this happened before, I tried moving the drive onto my other SATA controller and had the same results. Both controllers are made by Promise, so that may not have been a useful test for determining whether this is a driver issue.


>Description:
	ad14 had disappeared, as shown by the following in /var/log/messages:
Dec 29 01:57:21 zfsserver kernel: ad14: FAILURE - device detached
Dec 29 01:57:21 zfsserver kernel: subdisk14: detached
Dec 29 01:57:21 zfsserver kernel: ad14: detached
Dec 29 01:57:22 zfsserver root: ZFS: vdev failure, zpool=data type=vdev.open_failed

I tried running atacontrol reinit ata7 to rediscover the drive, but that didn't find it, so I rebooted to bring it back.
Then I ran a zpool scrub to verify the data.  A couple of minutes into the scrub, the kernel panicked.

	ad14 is connected to the "Promise SATAII150 518 (tm) IDE Controller" (atapci2, channel ata7).

	Last few lines from /var/log/messages:

Dec 29 02:24:08 zfsserver kernel: ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Dec 29 02:24:12 zfsserver kernel: ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
Dec 29 02:24:16 zfsserver kernel: ad14: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
Dec 29 02:24:20 zfsserver kernel: ad14: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - SET_MULTI taskqueue timeout - completing request directly
Dec 29 02:24:24 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801523
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=482801523
Dec 29 02:24:24 zfsserver root: ZFS: checksum mismatch, zpool=data path=/dev/ad14 offset=247190218240 size=32768
Dec 29 02:24:29 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801651
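
When a failure like this recurs, it can help to tally which LBAs keep being retried. The following is only a sketch: the regular expression and the summarize helper are assumptions written against the message format shown above, not part of any FreeBSD tool, and the sample text is a subset of the lines quoted from /var/log/messages.

```python
# Sketch: count ata(4) kernel messages per severity and per LBA for one disk.
# The pattern is an assumption based on the log lines quoted in this report.
import re
from collections import Counter

LOG = """\
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - SET_MULTI taskqueue timeout - completing request directly
Dec 29 02:24:24 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801523
Dec 29 02:24:24 zfsserver kernel: ad14: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=482801523
Dec 29 02:24:24 zfsserver root: ZFS: checksum mismatch, zpool=data path=/dev/ad14 offset=247190218240 size=32768
Dec 29 02:24:29 zfsserver kernel: ad14: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=482801651
"""

# Device name, severity keyword, free-form message, optional trailing LBA.
LINE = re.compile(
    r"kernel: (?P<dev>ad\d+): (?P<kind>[A-Z]+) - (?P<msg>.*?)(?: LBA=(?P<lba>\d+))?$"
)


def summarize(log, dev="ad14"):
    """Return (messages per severity, messages per LBA) for the given disk."""
    kinds, lbas = Counter(), Counter()
    for line in log.splitlines():
        m = LINE.search(line)
        if m is None or m.group("dev") != dev:
            continue  # skip non-kernel lines such as the ZFS notice
        kinds[m.group("kind")] += 1
        if m.group("lba"):
            lbas[int(m.group("lba"))] += 1
    return kinds, lbas


kinds, lbas = summarize(LOG)
print(dict(kinds))
print(dict(lbas))
```

Feeding it the full /var/log/messages instead of the sample would show whether the timeouts cluster on a few LBAs or are spread across the disk.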

# kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.25
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
ad14: FAILURE - device detached
subdisk14: detached
ad14: detached


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x2c
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc0632e75
stack pointer           = 0x28:0xef33bc5c
frame pointer           = 0x28:0xef33bc70
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3 (g_up)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper(c0984eeb,ef33baf8,c063f33f,c09a366c,0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c09a366c,0,c09649c3,ef33bb04,0,...) at kdb_backtrace+0x29
panic(c09649c3,c09a4913,c3f544d0,1,1,...) at panic+0x10f
trap_fatal(c0a65020,0,2,8,dd313180,...) at trap_fatal+0x333
trap_pfault(c0a64ac8,ef33bb90,c066d3dd,ef33bbb4,c,...) at trap_pfault+0x250
trap(ef33bc1c) at trap+0x3c6
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc0632e75, esp = 0xef33bc5c, ebp = 0xef33bc70 ---
_mtx_lock_flags(1c,0,c0bf6e0d,1d8,c0bec2a0,...) at _mtx_lock_flags+0x15
vdev_geom_io_intr(c4e4f7bc,c0a17e04,0,0,0) at vdev_geom_io_intr+0x44
biodone(c4e4f7bc,c0a64a28,24c,c097d445,64,...) at biodone+0xad
g_io_schedule_up(c3f0dc60,4c,c097e119,5b,0,...) at g_io_schedule_up+0x7f
g_up_procbody(0,ef33bd38,0,ffffffff,ffffffff,...) at g_up_procbody+0x6c
fork_exit(c05eea20,0,ef33bd38) at fork_exit+0x97
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xef33bd70, ebp = 0 ---
Uptime: 8m0s
Physical memory: 1011 MB
Dumping 258 MB: 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3


I'm not sure what else to report.


>How-To-Repeat:
	I can't reproduce it. :-(

>Fix:

	unknown


>Release-Note:
>Audit-Trail:
>Unformatted:

