Automatic unit start broken?

Kenneth D. Merry ken at freebsd.org
Tue Feb 28 08:10:09 PST 2006


On Mon, Feb 27, 2006 at 21:43:27 +0100, Bernd Walter wrote:
> On Mon, Feb 27, 2006 at 01:22:54PM -0700, Kenneth D. Merry wrote:
> > On Mon, Feb 27, 2006 at 21:16:45 +0100, Bernd Walter wrote:
> > > It seems that FreeBSD doesn't start disks anymore.
> > 
> > That's strange, I don't think anything has changed in that area recently.
> 
> I don't think this happened recently.
> It is a new machine, but I already noticed several months back that I
> can't camcontrol stop a mounted drive anymore without getting problems
> on access later.
> 
> > > Disks with delayed start won't get probed by GEOM if they are not
> > > ready when GEOM tries - at least one can delay booting in this case.
> > > Disks which don't start up unless told to never work - I currently
> > > start them manually in rc scripts and tell GEOM to reprobe, but this
> > > is not always an option, e.g. in the case of the / drive.
> > > All in all this is very annoying since I don't have many options for
> > > changing the disk spin-up policy.
> > 
> > What error code do your disks return?  You will probably see some console
> > output if GEOM has tried to read metadata off the disk and that initial
> > read fails.
> > 
> > If the drive returns 0x04,0x02 ("Logical unit not ready, initializing cmd.
> > required"), CAM will attempt to spin the disk up automatically and retry
> > the command.
> 
> During the first tests I waited 90s in the loader to let all delayed
> spin-up drives spin up.
> This is with recent RELENG_6 and a drive which doesn't spin up by itself:
> [...]
> da7 at esp1 bus 0 target 10 lun 0
> da7: <SEAGATE ST336706LC 8A03> Fixed Direct Access SCSI-3 device 
> da7: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da7: Attempt to query device size failed: NOT READY, Logical unit not ready, initial

That's rather odd, since it looks like you've got an 0x04,0x02 response,
but the device must have rejected the start unit command if we failed to
get capacity information.

> [...]
> No GEOM message about this drive until rc sends a start command and
> GEOM is retriggered to reread the drive:
> Unit started successfully
> GEOM_LABEL: Label for provider da7 is ufs/dump1.
> The following commands were used in rc:
> camcontrol start -n da -u 7
> cat /dev/null > /dev/da7
> 
> Without the loader delay other disks are having problems as well:
> da9 at esp1 bus 0 target 14 lun 0
> da9: <IBM DDYS-T36950M S80D> Fixed Direct Access SCSI-3 device 
> da9: 20.000MB/s transfers (10.000MHz, offset 15, 16bit), Tagged Queueing Enabled
> da9: Attempt to query device size failed: NOT READY, Logical unit is in process of b
> 

That's a different error.  We won't send a start unit in that case.  The
error recovery action for 0x04,0x01 is to send a test unit ready every half
second for a minute until the device becomes ready.

Evidently it didn't become ready after that period of time.
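
As a rough userland illustration of that polling behaviour (this is not the
CAM code itself, which lives in the kernel), something like the following
could be used from a script. The function name wait_ready and its arguments
are made up for this sketch. It assumes that "camcontrol tur" exits non-zero
while the unit is not ready and that sleep accepts fractional seconds.

```sh
# wait_ready TRIES INTERVAL CMD [ARGS...]   (hypothetical helper)
# Run CMD up to TRIES times, sleeping INTERVAL seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if it never succeeds.
wait_ready() {
    tries=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # No point in sleeping after the final failed attempt.
        if [ "$i" -lt "$tries" ]; then
            sleep "$interval"
        fi
    done
    return 1
}
```

Every half second for a minute works out to 120 attempts, so a call along
the lines of "wait_ready 120 0.5 camcontrol tur da9" would approximate the
same schedule from userland.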

> On Shell:
> [30]cicely19# dd if=/dev/da7 bs=1k count=1 of=/dev/null
> 1+0 records in
> 1+0 records out
> 1024 bytes transferred in 0.008765 secs (116829 bytes/sec)
> [31]cicely19# camcontrol stop -n da -u 7
> Unit stopped successfully
> [32]cicely19# dd if=/dev/da7 bs=1k count=1 of=/dev/null
> dd: /dev/da7: Input/output error
> 0+0 records in
> 0+0 records out
> 0 bytes transferred in 0.004810 secs (0 bytes/sec)
> Exit 1

What errors do you see on the console at that point?  In order for CAM to
automatically spin up the disk, the drive needs to return 0x04,0x02 when it
is spun down, and it needs to actually spin up in response to a start unit.
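
To summarize the two cases discussed in this thread, here is a small lookup
sketch. The function name recovery_action is hypothetical, and it only
covers the two ASC/ASCQ pairs mentioned here; it is not a copy of CAM's
actual sense table.

```sh
# recovery_action ASC ASCQ   (hypothetical helper)
# Print the recovery action described in this thread for a NOT READY
# sense code, given as two hex strings such as 0x04 0x02.
recovery_action() {
    case "$1,$2" in
        0x04,0x01)
            # "Logical unit is in process of becoming ready"
            echo "poll with TEST UNIT READY"
            ;;
        0x04,0x02)
            # "Logical unit not ready, initializing cmd. required"
            echo "send START UNIT, then retry"
            ;;
        *)
            echo "not covered in this thread"
            ;;
    esac
}
```

For example, "recovery_action 0x04 0x02" prints the start unit case, which
is the one that should apply to a drive that has been spun down.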

What happens when you:

camcontrol stop da7
camcontrol tur da7 -v
camcontrol start da7 -v
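
If it is easier to capture everything in one go, a small wrapper along
these lines would run the three commands in order and record each exit
status. The function name diag_sequence and the CAMCONTROL variable are
made up for this sketch; setting CAMCONTROL=echo gives a dry run that does
not touch the drive.

```sh
# Command used to talk to CAM; override for a dry run (CAMCONTROL=echo).
: "${CAMCONTROL:=camcontrol}"

# diag_sequence DEVICE   (hypothetical helper)
# Run stop, tur and start against DEVICE, printing each command,
# its output and its exit status.
diag_sequence() {
    dev=$1
    for args in "stop $dev" "tur $dev -v" "start $dev -v"; do
        printf '$ camcontrol %s\n' "$args"
        status=0
        # $args is intentionally unquoted so it splits into words.
        $CAMCONTROL $args 2>&1 || status=$?
        echo "(exit status: $status)"
    done
}
```

Running "diag_sequence da7 > /tmp/da7.log" and sending the log along with
any console messages would show how the drive responds at each step.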

Ken
-- 
Kenneth Merry
ken at FreeBSD.ORG

