sparc64/121539: Interrupt storm booting 7.0-R/sparc64 on ultra5
jpd at dsb.tudelft.nl
Thu Mar 13 00:50:03 UTC 2008
The following reply was made to PR sparc64/121539; it has been noted by GNATS.
From: jpd at dsb.tudelft.nl
To: Marius Strobl <marius at alchemy.franken.de>
Cc: bug-followup at freebsd.org
Subject: Re: sparc64/121539: Interrupt storm booting 7.0-R/sparc64 on ultra5
Date: Thu, 13 Mar 2008 01:13:39 +0100
On Wed, Mar 12, 2008 at 23:54:45 +0100, Marius Strobl wrote:
[snip!]
> Vector 2016 is the ATA controller and the ata(4)/acd(4) apparently
> has some problems accessing the CD. Could you please check whether
> the cabling and the drive are ok and functional?
Apologies for the narrative. The answer to your question is in the next
paragraph and in the paragraphs just before the last interrupt storm.
The rest is me attempting to be thorough. In short: yes, overall I think
they're ok.
I just checked, and the cable said 'click' when I pushed on it at the
hard drive end, so a marginal connection seems likely. The other
connections seem to be ok, if old, ATA33-only cables. The cdrom I
swapped for a then-new dvd drive (i.e. it's not Sun-original) and it
should be ok. It was used for installing 5.4 and Solaris 10 from dvd a
while back. The system has been mostly offline in the meantime.
I'd like to note that booting 5.4 (which I did before and after trying
to boot 7.0 for the first time) didn't have the problem, but 7.0 did,
both while booting from cdrom and from hard drive, so whether that was
an actual marginal connection, I guess we'll find out next (see below).
I probably should've made the connection between the one notice and the
other, although, not knowing what vector 2016 was, I substituted
ignorance and went ahead. I noticed that *eventually* it'll go through,
maybe prodded along by sending a couple of breaks, at which point I
rolled a 7.0 base+man over the previous one. Once it booted it stopped
complaining, mostly.
Then I checked out src and built a custom kernel. Installing it would
get me DMA errors when it got to the twe module, although (again) brute
force eventually got around it.
On a lark I checked out the relevant bits of ports and installed
smartmontools, and ran an offline test. The output looked all green
except for a non-zero but low (14) Reallocated_Event_Count. So I think
the hard disk drive and presumably the dvd drive are in reasonable
shape.
While I'm writing this the machine has sat twirling away just as it did
before, *very slowly* loading the kernel (which it used to do much
faster, even with the interrupt storm messages coming up later) and
eventually getting to a boot stage, but it will then panic. If this
keeps up after I get a fresh image on it, I'll ask for help about that.
Consoles: Open Firmware console
Booting with sun4u support.
FreeBSD/sparc64 bootstrap loader, Revision 1.0
(root@obrian.cse.buffalo.edu, Sun Feb 24 17:36:50 UTC 2008)
bootpath="/pci@1f,0/pci@1,1/ide@3/disk@0,0:a"
Loading /boot/defaults/loader.conf
/boot/kernel/kernel data=0x412648+0x5b2a8 syms=[0x8+0x59340+0x8+0x4e312]
/
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...
nothing to autoload yet.
jumping to kernel entry at 0xc0060000.
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-RELEASE #1: Tue Mar 11 21:11:39 UTC 2008
root@aquablue.local:/usr/src/sys/sparc64/compile/AQUABLUE
panic: trap: memory address not aligned
Uptime: 1s
I might very well have forgotten something important in the compile,
but I can't help but wonder why it started to load so slowly after I
installed my custom kernel. Compile #0 worked, though. I'll see what
happens when I get it to boot GENERIC again, compile again, and so
forth.
Now, long story short: I double-checked the connections, closed up
the case, and booted GENERIC from the install cd again. Booting with
hw.ata.atapi_dma=0 and hw.ata.ata_dma=0 makes the interrupt storm go
away, although it will still complain:
acd0: FAILURE - READ_BIG ILLEGAL REQUEST asc=0x64 ascq=0x00
GEOM_LABEL: Label for provider acd0 is iso9660/FreeBSD_Install.
acd0: FAILURE - READ_BIG ILLEGAL REQUEST asc=0x64 ascq=0x00
Only three lines though. atapi_dma=0 and ata_dma=1 does the same.
atapi_dma=1 and ata_dma=0 brings the interrupt storms back again.
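To make the pattern in those three boots explicit, here is a minimal
sketch that records only the combinations reported above and checks
which tunable the boot-time storm follows. The dictionary layout and
the helper name storm_tracks are illustrative assumptions; the fourth
combination is left out because it was not tested in this round.

```python
# Observations as reported in this message only:
# (hw.ata.atapi_dma, hw.ata.ata_dma) -> interrupt storm seen at boot?
observations = {
    (0, 0): False,
    (0, 1): False,
    (1, 0): True,
}


def storm_tracks(index):
    """Return True if, in every observation, the storm occurred exactly
    when the tunable at position `index` was set to 1."""
    return all(bool(key[index]) == storm for key, storm in observations.items())


print("follows atapi_dma:", storm_tracks(0))
print("follows ata_dma:  ", storm_tracks(1))
```

With these three data points the boot-time storm follows atapi_dma and
not ata_dma, which is consistent with the ATAPI drive's DMA path being
involved, but three boots are not proof.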
While in an emergency shell booted with hw.ata.atapi_dma=0 I managed to
trigger an interrupt storm by accessing the cdrom (`ls') anyway:
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
ata2: reiniting channel ..
ata2: reset tp1 mask=03 ostat0=51 ostat1=00
ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata2: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata2: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ad0: setting PIO4 on CMD 646 chip
ad0: setting WDMA2 on CMD 646 chip
ata2: reinit done ..
ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=376800
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
ata2: reiniting channel ..
ata2: reset tp1 mask=03 ostat0=51 ostat1=00
ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata2: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata2: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ad0: setting PIO4 on CMD 646 chip
ad0: setting WDMA2 on CMD 646 chip
ata2: reinit done ..
ad0: TIMEOUT - READ_DMA retrying (0 retries left) LBA=376800
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
ata2: reiniting channel ..
ata2: reset tp1 mask=03 ostat0=51 ostat1=00
ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata2: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata2: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ad0: setting PIO4 on CMD 646 chip
ad0: setting WDMA2 on CMD 646 chip
ata2: reinit done ..
ad0: FAILURE - READ_DMA timed out LBA=376800
g_vfs_done():ad0a[READ(offset=192921600, length=16384)]error = 5
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
ata2: reiniting channel ..
ata2: reset tp1 mask=03 ostat0=51 ostat1=00
ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata2: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata2: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ad0: setting PIO4 on CMD 646 chip
ad0: setting WDMA2 on CMD 646 chip
ata2: reinit done ..
ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=376800
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
interrupt storm detected on "vec2016:"; throttling interrupt source
ata2: reiniting channel ..
ata2: reset tp1 mask=03 ostat0=51 ostat1=00
ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata2: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata2: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ad0: setting PIO4 on CMD 646 chip
ad0: setting WDMA2 on CMD 646 chip
ata2: reinit done ..
ad0: FAILURE - READ_DMA timed out LBA=376800
g_vfs_done():ad0a[READ(offset=192921600, length=16384)]error = 5
ls: firmware: Input/output error
ls: kernel.generic: Input/output error
ls: zfs: Input/output error
boot1 kernel/ loader.4th loader.rc
defaults/ kernel.5.4/ loader.conf modules/
device.hints loader* loader.help support.4th
Fixit# mount
/dev/md0 on / (ufs, local)
devfs on /dev (devfs, local)
/dev/acd0 on /dist (cd9660, local, read-only)
/dev/ad0a on /mnt (ufs, local)
/dev/ad0d on /mnt/usr/local (ufs, local, soft-updates)
Fixit#
I'm not sure why `zfs' reports an i/o error.
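As a rough aid for reading the ata2/ad0 lines in the log above, the
following minimal sketch decodes the status bytes and cross-checks the
g_vfs_done() offset against the LBA in the READ_DMA messages. It
assumes that the status values are hexadecimal (including the ones
printed without a 0x prefix), that sectors are 512 bytes, and it uses
the classic ATA status bit names; the function names are hypothetical.

```python
# Classic ATA status-register bit names, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]


def offset_to_sector(offset, sector_size=SECTOR_SIZE):
    """Convert a byte offset to a sector number; reject unaligned offsets."""
    if offset % sector_size != 0:
        raise ValueError("offset is not sector-aligned")
    return offset // sector_size


print(decode_status(0x51))          # ostat0=51 before the reset
print(decode_status(0x50))          # stat0=0x50 after the reset
print(offset_to_sector(192921600))  # offset from the g_vfs_done() line
print(16384 // SECTOR_SIZE)         # request length in sectors
```

Under those assumptions, 0x51 has the error bit set and 0x50 does not,
and the byte offset 192921600 works out to sector 376800, the same
number as the LBA in the READ_DMA timeouts.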