ata "fallback to PIO mode" on dual processor AMD systems

Francesco Casadei fcasadei at inwind.it
Fri Sep 24 08:51:08 PDT 2004


On Tue, Dec 31, 2002 at 03:57:16PM -0500, Bruce Campbell wrote:
> 
> I am seeing a problem with ata disks on 4 new systems, which
> I believe is either a bug in the ata driver, or a problem with
> the onboard IDE controller, or something else.  Systems are as follows:
> 
> Motherboard: ASUS A7M266-D
> CPUs       : 2 x 2000+ AMD MP
> Memory     : 2 x 512MB Crucial part: CT6472Y265
> 
> Disks (all UDMA100):
> 
>             Master                   Slave
> System 1:  WDC WD400BB             WDC WD1000BB
> System 2:  WDC WD400BB             WDC WD1000BB
> System 3:  WDC WD400BB             WDC WD800BB
> System 4:  WDC WD400BB             Maxtor 98196H8
> 
> Kernel : 4.7-RELEASE, custom kernel (compared to GENERIC):
> 
> commented out:
> 
>  cpu           I386_CPU
>  cpu           I486_CPU
> 
> enabled 
> 
>  options       SMP                     # Symmetric MultiProcessor Kernel
>  options       APIC_IO                 # Symmetric (APIC) I/O
> 
> 
> I am running a test with "dbench" (/usr/ports/benchmarks/dbench)
> with a script which runs:
> 
>   dbench 1
>   sleep for 5 minutes
>   dbench 2
>   sleep for 5 minutes
>   dbench 3
>   ...
> 
> to simulate 1,2,3... clients.
> 
> The following has happened on systems 2,3 and 4, after about 15 hours
> of running the test:
> 
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 - resetting
> Dec 30 23:26:59 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:26:59 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> Dec 30 23:27:00 ecserv13 /kernel: ad0: WRITE command timeout tag=0 serv=0 resetting
> Dec 30 23:27:00 ecserv13 /kernel: ad0: timeout waiting for cmd=ef s=d0 e=00
> Dec 30 23:27:00 ecserv13 /kernel: ad0: trying fallback to PIO mode
> Dec 30 23:27:00 ecserv13 /kernel: ata0: resetting devices .. done
> 
> The test continues to run with the ata controller in PIO mode, with
> slower performance, and higher load average.
> 
> Once the master drops to PIO, attempts to access the slave then cause
> it to drop to PIO.
> 
> If I run:
> 
>   atacontrol mode 0 UDMA100 UDMA100
> 
> attempts to access either drive result in a delay until the controller
> drops to PIO, and then operations resume.  A soft reboot and things
> work in UDMA mode again.  Also tried UDMA33 and UDMA66 with no change.
> I also tried "atacontrol reinit 0" with no help.
> 
> Theories when I search the web for "fallback to PIO mode" include:
> 
>  - bad disks
>  - something to do with thermal recalibration
> 
> I don't believe the problem is bad disks, as the slave drops to PIO
> after the master does, and I can't get it back to UDMA, other than by
> soft reboot.  Plus I see the problem on 6 of 8 disks.
> 
> The problem is very repeatable.
> 
> Can anyone offer any ideas or suggest investigative steps?  I have a system
> in PIO mode right now.
> 
> Thanks,
> 
> -- 
> Bruce Campbell
> Engineering Computing
> CPH-2374B
> University of Waterloo
> (519)888-4567 ext 5889
> 
> end of the original message
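
For anyone who wants to reproduce the load, the test loop described in the
quoted message could be sketched roughly as follows. This is only a guess at
the shape of the original script: the function name run_dbench_series and its
two arguments are made up for illustration, and only the dbench invocation and
the 5 minute pause come from the message itself.

```shell
#!/bin/sh
# Sketch of the quoted test loop: run dbench with 1, 2, 3, ... clients,
# pausing between runs. The upper bound is an assumption; the original
# message does not say how far the client count was taken.

run_dbench_series() {
    max_clients=$1          # highest client count to simulate (assumption)
    pause=${2:-300}         # seconds between runs; 300 = 5 minutes
    n=1
    while [ "$n" -le "$max_clients" ]; do
        echo "=== dbench $n ==="
        dbench "$n" || echo "dbench $n exited with status $?"
        n=$((n + 1))
        # no need to pause after the final run
        [ "$n" -le "$max_clients" ] && sleep "$pause"
    done
    return 0
}

# Example: run_dbench_series 20 300
```

Redirecting the output to a file with timestamps makes it easier to line up
the client count with the time of the first timeout in /var/log/messages.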

Same problem here, but slightly different configuration:

# atacontrol list
ATA channel 0:
    Master:  ad0 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present
ATA channel 1:
    Master: acd0 <LG CD-ROM CRD-8521B/1.03> ATA/ATAPI rev 0
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present
ATA channel 3:
    Master:  ad6 <IC35L040AVER07-0/ER4OA44A> ATA/ATAPI rev 5
    Slave:       no device present

ad4 and ad6 are attached to a Promise FastTrak 100 TX2 ATA RAID controller.

# atacontrol mode 0
Master = UDMA100 
Slave  = ???

# atacontrol mode 1
Master = PIO4 
Slave  = ???

# atacontrol mode 2
Master = UDMA100 
Slave  = ???

# atacontrol mode 3
Master = PIO4 
Slave  = ???
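
A small helper along these lines can check the channels periodically instead
of typing the commands by hand. It is a sketch that assumes the two-line
"Master = ..." / "Slave = ..." output format shown above; master_mode and
is_pio are names invented here, not part of atacontrol.

```shell
#!/bin/sh
# Extract the master's transfer mode from the output of
# "atacontrol mode <channel>" read on stdin.
master_mode() {
    awk '$1 == "Master" && $2 == "=" { print $3 }'
}

# Succeed if the given mode string names a PIO mode (PIO0..PIO4).
is_pio() {
    case "$1" in
        PIO*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Example loop over the four channels listed above:
#   for ch in 0 1 2 3; do
#       m=$(atacontrol mode "$ch" | master_mode)
#       is_pio "$m" && echo "channel $ch master is in $m"
#   done
```

Channel 1 will always be reported, since the CD-ROM runs in PIO4 here anyway;
only a disk channel showing up is interesting.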

ad6 falls back to PIO mode under heavy I/O activity, i.e. when the system does
a level 0 file system dump from the RAID 1 array (ad4, ad6) to the backup disk
ad0.
Rebooting and rebuilding the array with the Promise BIOS utility temporarily
solves the problem. The system may be up and running for 1-4 weeks, doing a
level 0 dump every morning at 5:30am, and then one day the drive ad6 falls back
to PIO mode again (shortly before the dump completes).
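
To pin down when the fallback happens relative to the dump, the kernel log can
be scanned for the message. The sketch below assumes the log lines have the
same syslog layout as in the quoted message ("... /kernel: adN: trying fallback
to PIO mode"); pio_fallbacks is a hypothetical helper name and the log path in
the example is the usual default, which may differ on other setups.

```shell
#!/bin/sh
# Print each device that logged a PIO fallback, together with the
# timestamp of its first occurrence. Reads syslog-style lines on stdin.
pio_fallbacks() {
    awk '/trying fallback to PIO mode/ {
        # fields: month day time host /kernel: adN: trying ...
        dev = $6
        sub(/:$/, "", dev)
        if (!(dev in seen)) {
            seen[dev] = 1
            print dev, $1, $2, $3
        }
    }'
}

# Example (log path is an assumption; check syslog.conf):
#   pio_fallbacks < /var/log/messages
```

Run from cron shortly after the dump window, this gives one line per affected
drive that can be compared against the dump start and end times.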

Do the hard drives you are using support ATA tagged queuing? And if so, do
you have TQ enabled?
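
One way to check, assuming a 4.x system where the ata(4) driver exposes the
hw.ata.tags tunable (1 = tagged queuing requested, 0 = off), is sketched
below. tq_state is a name made up for this example, and the sysctl may simply
be absent on other releases.

```shell
#!/bin/sh
# Translate the value of hw.ata.tags into a readable answer.
# Anything other than 0 or 1 (including an empty string when the
# sysctl does not exist) is reported as unknown.
tq_state() {
    case "$1" in
        1) echo "tagged queuing enabled" ;;
        0) echo "tagged queuing disabled" ;;
        *) echo "unknown" ;;
    esac
}

# Example:
#   tq_state "$(sysctl -n hw.ata.tags 2>/dev/null)"
```

The tunable only says what the driver was asked to do; whether a given drive
actually runs with tags has to be read from its probe line in dmesg.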

	Francesco Casadei

-- 
You can download my public key from http://digilander.libero.it/fcasadei/
or retrieve it from a keyserver (pgpkeys.mit.edu, wwwkeys.pgp.net, ...)

Key fingerprint is: 1671 9A23 ACB4 520A E7EE  00B0 7EC3 375F 164E B17B
