ATA READ command timeout (and worse)
Stephen McKay
smckay at internode.on.net
Mon Jun 23 11:45:47 PDT 2003
On Wednesday, 18th June 2003, Stephen McKay wrote:
>I recompiled the kernel with DDB. A few test runs and I got this:
>
>Jun 18 19:19:44 peon /kernel: ad4: no status, reselecting device
>Jun 18 19:19:44 peon /kernel: ad4: timeout sending command=c8 s=ff e=00
>Jun 18 19:19:44 peon /kernel: ad4: error executing command - resetting
>Jun 18 19:19:44 peon /kernel: ata2: resetting devices ..
>Jun 18 19:19:44 peon /kernel: ad4: removed from configuration
>Jun 18 19:19:44 peon /kernel: ad5: removed from configuration
>Jun 18 19:19:44 peon /kernel: done
>
>Fatal trap 12: page fault while in kernel mode
>fault virtual address = 0x63657865
After I compiled with INVARIANTS, the crash changed a little:
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xdeadc0de
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc012c7f0
stack pointer = 0x10:0xcd7fdbbc
frame pointer = 0x10:0xcd7fdbc8
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 397 (diff)
interrupt mask = bio
kernel: type 12 trap, code=0
Stopped at ad_detach+0x34: cmpl %esi,0(%ebx)
db> trace
ad_detach(c11ba12c,0) at ad_detach+0x34
ata_reinit(c11ba100,c11ba100,0,0,0) at ata_reinit+0x86
ad_transfer(c1302280) at ad_transfer+0x49c
ata_start(c11ba100,0,c12468a4,c651b860,c1246958) at ata_start+0x98
adstrategy(c651b860,c651b860,cd05e780,cd7fdc74,c0192262) at adstrategy+0x95
diskstrategy(c651b860,c1253800,c651b860,c1293a00,cd7fdc80) at diskstrategy+0x95
...
The "0xdeadc0de" fault address implies the illegal use of a freed structure:
with INVARIANTS, the kernel fills freed memory with that pattern. After
poking about a bit in DDB, it looks as though ad_detach is using a data
structure after it has been freed. And indeed, it is.
After the ad_free(request) call, the request->chain field is used implicitly
by the TAILQ_FOREACH() macro to step to the next element, causing hideous
painful death. The solution is to do the TAILQ_FOREACH() traversal manually,
and a little more carefully. Then I'll be able to see if having my disks
disappear is recoverable.
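
A minimal userland sketch of the "manual and more careful" traversal, for
illustration only. It is not the actual ata-disk.c code: struct request,
its dev field, enqueue(), purge_device() and queue_length() are all
hypothetical names, and plain free() stands in for ad_free(). The point is
only that the next pointer is saved before the current element is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical request type; the real ATA driver's structure differs. */
struct request {
    int dev;                        /* device this request belongs to */
    TAILQ_ENTRY(request) chain;     /* queue linkage */
};

TAILQ_HEAD(request_queue, request);

/* Append a new request for device dev to the tail of the queue. */
static void enqueue(struct request_queue *q, int dev)
{
    struct request *r = malloc(sizeof(*r));

    assert(r != NULL);
    r->dev = dev;
    TAILQ_INSERT_TAIL(q, r, chain);
}

/*
 * Remove and free every request queued for device dev.
 * TAILQ_FOREACH() would read r->chain after free(r), so the successor
 * is fetched first and the loop is written out by hand.
 * Returns the number of requests freed.
 */
static int purge_device(struct request_queue *q, int dev)
{
    struct request *r, *next;
    int freed = 0;

    for (r = TAILQ_FIRST(q); r != NULL; r = next) {
        next = TAILQ_NEXT(r, chain);    /* save before r may be freed */
        if (r->dev != dev)
            continue;
        TAILQ_REMOVE(q, r, chain);
        free(r);                        /* stand-in for ad_free(r) */
        freed++;
    }
    return freed;
}

/* Count queued requests; TAILQ_FOREACH() is fine here, nothing is freed. */
static int queue_length(struct request_queue *q)
{
    struct request *r;
    int n = 0;

    TAILQ_FOREACH(r, q, chain)
        n++;
    return n;
}
```

Systems whose <sys/queue.h> provides TAILQ_FOREACH_SAFE() can use that
macro instead; it does the same save-the-next-pointer dance internally.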
Woo hoo! First bug found. I'll see if I can find more. :-)
Stephen.