sparc64/82261: DMA-support on Sparc64 broken

Marius Strobl marius at alchemy.franken.de
Wed Aug 10 20:30:18 GMT 2005


The following reply was made to PR sparc64/82261; it has been noted by GNATS.

From: Marius Strobl <marius at alchemy.franken.de>
To: sos at freebsd.org
Cc: Sebastian Koehler <acex5 at thrillkill.de>, freebsd-gnats-submit at freebsd.org
Subject: Re: sparc64/82261: DMA-support on Sparc64 broken
Date: Wed, 10 Aug 2005 22:23:36 +0200

 On Wed, Jun 15, 2005 at 09:14:32AM +0000, Sebastian Koehler wrote:
 > 
 > >Number:         82261
 > >Category:       sparc64
 > >Synopsis:       DMA-support on Sparc64 broken
 > >Confidential:   no
 > >Severity:       serious
 > >Priority:       high
 > >Responsible:    freebsd-sparc64
 > >State:          open
 > >Quarter:        
 > >Keywords:       
 > >Date-Required:
 > >Class:          sw-bug
 > >Submitter-Id:   current-users
 > >Arrival-Date:   Wed Jun 15 09:20:16 GMT 2005
 > >Closed-Date:
 > >Last-Modified:
 > >Originator:     Sebastian Koehler
 > >Release:        6.0-CURRENT-SNAP004
 > >Organization:
 > >Environment:
 > FreeBSD  6.0-20050601-SNAP FreeBSD 6.0-20050601-SNAP #0: Thu Jun  2 05:29:17 UTC 2005     root at u60.samsco.home:/usr/obj/usr/src/sys/GENERIC  sparc64
 > >Description:
 > A clean installation using FreeBSD media causes errors when DMA mode is used to access the IDE disks.
 > 
 > messages during installation:
 > ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570528
 > ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2570624
 > ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=2574752
 > 
 > if system is installed using hw.ata.ata_dma=0 the following happens, when system is booted with DMA enabled:
 > dc1: Ethernet address: 00:03:ba:0f:22:55
 > dc1: if_start running deferred for Giant
 > dc1: [GIANT-LOCKED]
 > pci0: <serial bus, USB> at device 10.0 (no driver attached)
 > atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
 > ata2: <ATA channel 0> on atapci0
 > ata3: <ATA channel 1> on atapci0
 > stray level interrupt 14
 > rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
 > uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
 > uart0: console (9600,n,8,1)
 > uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
 > Timecounters tick every 1.000 msec
 > ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
 > acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
 > Trying to mount root from ufs:/dev/ad0a
 > /libexec/ld-elf.so.1: /lib/libncurses.so.5: invalid file format
 > Enter full pathname of shell or RETURN for /bin/sh: 
 > 
 > or:
 > dc1: Ethernet address: 00:03:ba:0f:22:55
 > dc1: if_start running deferred for Giant
 > dc1: [GIANT-LOCKED]
 > pci0: <serial bus, USB> at device 10.0 (no driver attached)
 > atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
 > ata2: <ATA channel 0> on atapci0
 > ata3: <ATA channel 1> on atapci0
 > stray level interrupt 14
 > rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
 > uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
 > uart0: console (9600,n,8,1)
 > uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
 > Timecounters tick every 1.000 msec
 > ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master UDMA66
 > acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
 > Trying to mount root from ufs:/dev/ad0a
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > init in malloc(): error: recursive call
 > 
 > system is working fine with hw.ata.ata_dma=0:
 > dc1: Ethernet address: 00:03:ba:0f:22:55
 > dc1: if_start running deferred for Giant
 > dc1: [GIANT-LOCKED]
 > pci0: <serial bus, USB> at device 10.0 (no driver attached)
 > atapci0: <AcerLabs M5229 UDMA66 controller> port 0x10200-0x10207,0x10218-0x1021b,0x10210-0x10217,0x10208-0x1020b,0x10220-0x1022f at device 13.0 on pci0
 > ata2: <ATA channel 0> on atapci0
 > ata3: <ATA channel 1> on atapci0
 > rtc0: <Real Time Clock> at port 0x70-0x71 on isa0
 > uart0: <16550 or compatible> at port 0x3f8-0x3ff irq 43 on isa0
 > uart0: console (9600,n,8,1)
 > uart1: <16550 or compatible> at port 0x2e8-0x2ef irq 43 on isa0
 > Timecounters tick every 1.000 msec
 > ad0: 38166MB <Seagate ST340824A 3.28> at ata2-master PIO4
 > acd0: CDRW <RICOH CD-R/RW MP7200A/1.30> at ata3-master UDMA33
 > Trying to mount root from ufs:/dev/ad0a
 > Loading configuration files.
 > Entropy harvesting: interrupts ethernet point_to_point kickstart.
 > swapon: adding /dev/ad0b as swap device
 > Starting file system checks:
 > /dev/ad0a: FILE SYSTEM CLEAN; SKIPPING CHECKS
 > /dev/ad0a: clean, 102079 free (975 frags, 12638 blocks, 0.8% fragmentation)
 > /dev/ad0e: FILE SYSTEM CLEAN; SKIPPING CHECKS
 > /dev/ad0e: clean, 127341 free (29 frags, 15914 blocks, 0.0% fragmentation)
 > /dev/ad0f: FILE SYSTEM CLEAN; SKIPPING CHECKS
 > /dev/ad0f: clean, 17986047 free (4295 frags, 2247719 blocks, 0.0% fragmentation)
 > /dev/ad0d: FILE SYSTEM CLEAN; SKIPPING CHECKS
 > /dev/ad0d: clean, 127258 free (42 frags, 15902 blocks, 0.0% fragmentation)
 > >How-To-Repeat:
 > Try to access IDE drives in a Sun Netra X1 using DMA mode. Tested FreeBSD installation media 5.3-RELEASE and 6.0-CURRENT-SNAP004. Earlier releases not tested.
 > >Fix:
 > If DMA is not used (hw.ata.ata_dma=0 in bootloader) the messages go away and access to HDD is possible without errors, but only in PIO4.
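 
 The workaround quoted above can be applied either for a single boot
 at the loader prompt or persistently. A minimal sketch, assuming the
 stock FreeBSD loader and /boot/loader.conf:

```
# At the loader prompt, for a single boot:
#   set hw.ata.ata_dma=0
#   boot
#
# Or persistently, in /boot/loader.conf:
hw.ata.ata_dma="0"
```

 With DMA disabled the disk attaches in PIO4 mode, as in the third
 boot log quoted above.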
 
 
 Søren, could you please look into this? AFAIK you also have a
 Sun Netra X1. Like a couple of other Sun models, these use an
 onboard AcerLabs M5229 rev. 0xc3, and at least the 'TIMEOUT -
 WRITE_DMA retrying' warnings have been reported for pretty
 much all of them. It seems much less likely to experience them
 with the original Sun-supplied disks, though. On the other hand
 there are a few reports like <200508071916.50197.Chris at LainOS.org>
 on freebsd-current@ and this PR that the ATA disks aren't
 usable at all. The problems seem to have started some time
 in the early 5.x days, but an exact date isn't known, and they
 still persist after ATA mkIII. AFAICT the problems are also
 limited to UDMA66 and don't happen when restricting to UDMA33.
 Given that this also affects a couple of other models like
 the AX1105, Blade 100, Fire V100, etc., and it's not possible
 to plug in another controller on some of them, this unfortunately
 is a show-stopper type of problem.
 The AcerLabs M5229 rev. 0xc3 is also known to suffer from a
 silicon bug that can cause data corruption, but which doesn't
 seem to be the cause of the above-mentioned problems (the
 workaround is to disable and re-enable the respective channel
 via the IDE interface control register of the accompanying
 ISA bridge on reset, see the audit-trail of the PR for a
 patch; the info is from OpenSolaris and an equivalent patch
 was also incorporated into NetBSD). The data corruption issue
 has been seen under FreeBSD in the past, before other issues
 like the WRITE_DMA timeouts occurred, and only seems to happen
 occasionally rather than causing permanent problems like the
 other issues.
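 
 The channel toggle described above boils down to two
 read-modify-write accesses to one bit of a bridge register. Below is
 a minimal sketch in C against a simulated configuration space. The
 register offset, the bit layout and all function names are
 hypothetical placeholders chosen for illustration, not taken from
 the actual patch; the real values are in the audit-trail of the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only; take the real register
 * offset and bit layout from the patch in the PR audit-trail. */
#define IDE_IF_CTRL_REG   0x58u                          /* assumed offset */
#define CHAN_ENABLE(unit) ((uint8_t)(0x04u << (unit)))   /* assumed bit */

/* Simulated 256-byte PCI configuration space of the ISA bridge. */
static uint8_t isa_bridge_cfg[256];

/* Trace of the values written to the control register. */
static uint8_t write_log[8];
static int write_count;

static uint8_t cfg_read8(unsigned reg)
{
    return isa_bridge_cfg[reg & 0xffu];
}

static void cfg_write8(unsigned reg, uint8_t val)
{
    isa_bridge_cfg[reg & 0xffu] = val;
    if (reg == IDE_IF_CTRL_REG && write_count < 8)
        write_log[write_count++] = val;
}

/* Disable, then re-enable, one ATA channel (unit 0 or 1) while
 * leaving every other bit of the register untouched. */
static void ata_channel_toggle(unsigned unit)
{
    uint8_t v;

    assert(unit < 2);

    v = cfg_read8(IDE_IF_CTRL_REG);
    cfg_write8(IDE_IF_CTRL_REG, (uint8_t)(v & ~CHAN_ENABLE(unit)));

    v = cfg_read8(IDE_IF_CTRL_REG);
    cfg_write8(IDE_IF_CTRL_REG, (uint8_t)(v | CHAN_ENABLE(unit)));
}
```

 A reset handler would call ata_channel_toggle() with the channel
 number; in a real driver the two accessors would be replaced by the
 bus's own config-space read and write routines.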
 
 Thanks.
 

