slow probe for ata channel with only an atapi master on it

Bruce Evans bde at zeta.org.au
Mon Jan 5 08:38:19 PST 2004


On Mon, 5 Jan 2004, Soren Schmidt wrote:

> It seems Divacky Roman wrote:
>
> > I have something similar but with different numbers of stat and error
> > stat=0x50 err=0x50 lsb=0x50 msb=0x50
> >
> > something should be added to ata_reset to understand this
>
> That will be difficult as stat=0x50 means <READY,SEEK_COMPLETE> and
> that is a valid status for a device that is *present*.

Er, this would be simple.  Just exit the loop when both statuses show
that the device is non-busy, as in my version.  stat=0x50 also means
that the device is not BUSY.  After exiting the loop, the other registers
can be checked and found to be bogus.
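
To illustrate the idea, here is a minimal sketch, not the actual ata_reset
code. The names polls_until_idle, looks_phantom and struct probe_regs are
hypothetical, and the "all four registers identical" test is only an
assumed heuristic for "bogus", based on the stat/err/lsb/msb values
reported in this thread. The status samples are passed in as arrays so
that the logic can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATA_S_BUSY 0x80 /* BSY bit of the ATA status register */

/* Hypothetical snapshot of the registers read back from one device slot. */
struct probe_regs {
    uint8_t stat, err, lsb, msb;
};

/*
 * Walk through status samples taken from the master (stat0) and the
 * slave (stat1).  Return the index of the first sample at which neither
 * reports BUSY, or -1 if that never happens within nsamples.  A real
 * driver would read the hardware and delay between polls instead.
 */
int
polls_until_idle(const uint8_t *stat0, const uint8_t *stat1, int nsamples)
{
    for (int i = 0; i < nsamples; i++) {
        if ((stat0[i] & ATA_S_BUSY) == 0 && (stat1[i] & ATA_S_BUSY) == 0)
            return (i);
    }
    return (-1);
}

/*
 * Assumed heuristic: once the loop has exited, a slot whose four
 * registers all hold the same value (e.g. 0x50 everywhere) is treated
 * as bogus, i.e. no device is really there.
 */
bool
looks_phantom(const struct probe_regs *r)
{
    return (r->stat == r->err && r->stat == r->lsb && r->stat == r->msb);
}
```

With stat=0x50 on both devices the loop exits on the first sample, and
the err/lsb/msb check then rejects the slot, so the probe is not delayed.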

BTW, I've had 2 drives go bad lately.  The deadest one developed a few
bad sectors a few years ago and started failing on all i/o's this year.
It returns err=0x02, but ATA_ATA_IDENTIFY still works perfectly on it.
It used to be probed successfully until the check on err was added
recently.  I think the probe should succeed, so that ioctls work.
Someday there should be ioctls to ask it why it failed.  W98 has
interesting problems with this drive.  Its probe fails for both the
drive and a cdrom on the same channel.  The cdrom works normally with
the ata driver.

The other drive is undead.  It seems to fail to spin up sometimes.  It
works perfectly if its probe succeeds and I start accessing it immediately,
but tends to fail if I don't access it for a while.  It now always fails
overnight.  I'm wondering if it spins down and then the spin up doesn't
work, and plan to try putting it in sleep modes intentionally.  The
driver handles its failure poorly.  The failure is usually hard (it takes
several power cycles to recover from), so it gets "removed from
configuration" after several seconds or minutes of the system being
unusable because it is blocked on Giant.  Then removal usually causes
a null pointer panic.

> I just wish firmware writers would stick to the specs they are writing
> against, but I guess that is just asking too much these days :(

There seems to be nothing out of spec in the above.  There is no device
there, so reading its registers gives garbage.  However, the garbage
doesn't include the BUSY bit so it shouldn't delay the probe.  I
believe that the spec is properly kludged so that this is the usual
behaviour for non-present devices.  If the other device doesn't
respond then there is nothing to set BUSY, and if the other device
does respond then it tends to set BUSY to its own BUSY and the
phantom device becomes unBUSY at the same time as the real one.
Getting 0x50 in all the registers also makes some sense.  Apparently
the other device responds with its status register for all register
reads.
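
For reference, a small sketch that decodes a status byte into bit names.
The bit positions are those of the ATA status register; the names in the
table and the function decode_status are illustrative choices, not taken
from the driver.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ATA status register bits, most significant first. */
static const struct {
    uint8_t     bit;
    const char *name;
} status_bits[] = {
    { 0x80, "BUSY" },
    { 0x40, "READY" },
    { 0x20, "WRITE_FAULT" },
    { 0x10, "SEEK_COMPLETE" },
    { 0x08, "DRQ" },
    { 0x04, "CORRECTED" },
    { 0x02, "INDEX" },
    { 0x01, "ERROR" },
};

/*
 * Write the comma-separated names of the bits set in stat into buf and
 * return buf.  If a name does not fit, decoding stops there; the buffer
 * is always NUL-terminated as long as len is nonzero.
 */
char *
decode_status(uint8_t stat, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return (buf);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]); i++) {
        if ((stat & status_bits[i].bit) == 0)
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
            used != 0 ? "," : "", status_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* error or output truncated */
        used += (size_t)n;
    }
    return (buf);
}
```

Decoding 0x50 this way gives READY,SEEK_COMPLETE, with BUSY clear, which
is why a phantom device echoing 0x50 need not hold up the probe.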

Bruce

