8-STABLE won't boot with ZFSv28

Jeremy Chadwick freebsd at jdc.parodius.com
Thu Jun 2 07:51:21 UTC 2011


On Thu, Jun 02, 2011 at 09:53:58AM +0300, Alexander Motin wrote:
> Hi.
> 
> Holger Kipp wrote:
> > got the same messages over and over again - panic took some time:
> > 
> > unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
> > ata0: reinit done ..
> > ata0: reiniting channel ..
> > ata0: DISCONNECT requested
> > 
> > <short delay here>
> > 
> > ata0: p0: SATA connect time=0ms status=00000113
> > ata0: p1: SATA connect timeout status=00000000
> > ata0: reset tp1 mask=03 ostat0=00 ostat1=00
> > ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
> > ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
> > ata0: reset tp2 stat0=00 stat1=00 devices=0x30000
> > unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
> > ata0: reinit done ..
> > ata0: reiniting channel ..
> > ata0: DISCONNECT requested
> 
> I see two problems here:
>  1. "devices=0x30000" means that two ATAPI devices were detected instead
> of one. I can reproduce it also with other Intel chipsets. It looks like
> a hardware bug to me. It can be worked around by reconnecting the ATAPI
> device to an even (2 or 4) SATA port, or by connecting any other device there.
>  2. "DISCONNECT requested" means that controller reported PHY status
> change for some device on channel, triggering infinite retry. Unluckily
> I have no ICH9 board, while I can't reproduce it with ICH10 or above.
> 
> This patch should work around the first problem in software:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
> Please try it, and let's see if with some luck it does something about the
> second problem.

With regard to item #1: I don't see anything in the ICH9 errata that
indicates a silicon bug if the only device attached to the controller is
an ATAPI device and connected to SATA port 0 (presumably), or an
odd-numbered port.  If this problem exists on other ICHxx and/or ESBxx
chips, I sure would hope it'd be documented.
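
For anyone following along, here is a rough sketch of why "devices=0x30000"
reads as two ATAPI devices.  The flag values are what I recall from
sys/dev/ata/ata-all.h and should be treated as assumptions until checked
against the tree; count_atapi() and describe_devices() are hypothetical
helpers, not driver functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed values, recalled from sys/dev/ata/ata-all.h; verify before relying on them. */
#define ATA_ATA_MASTER   0x00000001u
#define ATA_ATA_SLAVE    0x00000002u
#define ATA_ATAPI_MASTER 0x00010000u
#define ATA_ATAPI_SLAVE  0x00020000u

/* Hypothetical helper: count the ATAPI devices flagged in a channel's devices mask. */
static int
count_atapi(uint32_t devices)
{
	return ((devices & ATA_ATAPI_MASTER) != 0) +
	    ((devices & ATA_ATAPI_SLAVE) != 0);
}

/* Hypothetical helper: print which device slots the mask claims are occupied. */
static void
describe_devices(uint32_t devices)
{
	printf("devices=0x%x:%s%s%s%s\n", (unsigned)devices,
	    (devices & ATA_ATA_MASTER) ? " ata-master" : "",
	    (devices & ATA_ATA_SLAVE) ? " ata-slave" : "",
	    (devices & ATA_ATAPI_MASTER) ? " atapi-master" : "",
	    (devices & ATA_ATAPI_SLAVE) ? " atapi-slave" : "");
}
```

Under those assumed values, 0x30000 has both the ATAPI master and ATAPI
slave bits set, which matches the "two ATAPI devices" reading quoted above.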

I haven't tried confirming it myself, but if need be I can set up a test
box with a SATA-based DVD drive hooked up to it + provide remote serial
console/etc. if it'd be of any help.  I don't think it would be (sounds
like you have lots of hardware :-) ), but I'm willing to help in any way
I can.

With regard to item #2: could this be at all related to OOB (bit 15)
somehow being set in PCS (SATA register offset 0x92)?  I doubt it, but
I thought I'd ask.  My thought process, which is probably wrong
(consider it an educational discussion :-) ):

The ICH9 specification states that the default value for this register
is 0x0000, and b15=0 means "SATA controller will not retry after an OOB
failure", while b15=1 causes the controller to indefinitely retry after
OOB failure.  I imagine system BIOSes and other things can change this
default value, but we don't seem to print it anywhere in
ata_intel_chipinit() during a verbose boot.
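
To make the bit-15 check concrete, here is a minimal userland sketch of
decoding a 16-bit PCS value.  Only the offset (0x92) and the meaning of
bit 15 come from the datasheet text above; the macro and function names
are made up for illustration.  In the driver the value would presumably
come from a 2-byte pci_read_config() at that offset, but that part is not
shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PCS_OFFSET	0x92		/* SATA config register offset, per the text above */
#define PCS_OOB_RETRY	(1u << 15)	/* bit 15: retry indefinitely after OOB failure */

/* Hypothetical helper: true if the controller is set to retry forever after OOB failure. */
static bool
pcs_oob_retry(uint16_t pcs)
{
	return (pcs & PCS_OOB_RETRY) != 0;
}

/* Hypothetical helper: format the kind of line a verbose boot could print. */
static int
format_pcs(char *buf, size_t len, uint16_t pcs)
{
	return snprintf(buf, len, "PCS(0x%02x)=0x%04x OOB retry %s",
	    (unsigned)PCS_OFFSET, (unsigned)pcs,
	    pcs_oob_retry(pcs) ? "indefinite" : "disabled");
}
```

With the documented default of 0x0000, pcs_oob_retry() returns false; any
value with the top bit set (e.g. 0x8000) returns true.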

Looking at chipsets/ata-intel.c, it looks like we only touch PCS in
ata_intel_chipinit() and ata_intel_reset().  In the former, we avoid
touching bits 4 through 15, and in the latter we mask out only what we
want to adjust (e.g. the SATA port per ch variable).
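
As a sketch of what "avoid touching bits 4 through 15" and "mask out only
what we want to adjust" amount to, the read-modify-write pattern looks
roughly like this.  It is a userland illustration, not the driver code:
the 0x000f mask is my assumption (inferred from bits 4 through 15 being
left alone), and the function names and the port argument are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PCS_LOW_MASK	0x000fu	/* assumption: bits 0-3 are the only ones chipinit sets */

/* Set the low bits while leaving bits 4 through 15 (including bit 15) as found. */
static uint16_t
pcs_set_low_bits(uint16_t pcs)
{
	uint16_t updated = (uint16_t)(pcs | PCS_LOW_MASK);

	/* Invariant: nothing outside the mask may change. */
	assert((updated & ~PCS_LOW_MASK) == (pcs & ~PCS_LOW_MASK));
	return updated;
}

/* Clear a single port's bit (hypothetical 0-based port index), as a reset might. */
static uint16_t
pcs_clear_port(uint16_t pcs, unsigned port)
{
	return (uint16_t)(pcs & ~(1u << port));
}

/* Set a single port's bit again, leaving every other bit alone. */
static uint16_t
pcs_set_port(uint16_t pcs, unsigned port)
{
	return (uint16_t)(pcs | (1u << port));
}
```

The point of the sketch is that none of these operations can set or clear
bit 15, so if it is set, something else (BIOS, for instance) set it.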

Reference material is 14.1.31 of the ICH9 datasheet:
http://www.intel.com/assets/pdf/datasheet/316972.pdf

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
