Anthony's drive issues. Re: ssh password delay

Anthony Atkielski atkielski.anthony at wanadoo.fr
Tue Mar 22 07:19:59 PST 2005


Bart Silverstrim writes:

> Or it gave warnings that NT didn't.  Or it showed problems that NT
> didn't.

Unless someone can tell me what these messages mean, they are useless to
me, warnings or not.

> If it worked so well, why not put NT back on the machine and try
> running a battery of tests and diagnostics on the machine with NT to 
> see if it was masking a problem or not?

I don't have any T&D software for NT.

The smartctl utility looks very interesting, but I don't know to what
extent I can trust it.

> I don't remember ever losing data while using DOS...

I don't recall any specific instances of that, either.

> Really?  Windows' problems are fixed by the benevolent leader Allchin
> in MS?

No, but the support organizations of major vendors don't behave like
ill-mannered schoolboys when they are asked to show some responsibility
for their software.

> Or they could cover the other bases...that perhaps BSD is saying
> something that NT didn't report.

Rest assured, they will not do that.  They will cancel the FreeBSD
rollout and return to NT.  And I couldn't blame them.

> Yes, obviously FreeBSD is worthless.

Not worthless ... defective.  On this particular machine, the SCSI
problems are too serious to allow it to be used for any kind of
production.  I suppose a company could just swap out hardware over and
over until it got lucky.  Or it could run a different OS.

I've been a bit luckier on my production machine, which has brand-new
hardware.  Still, I'm getting mysterious SATA errors from time to time,
one of which crashed the system.  As usual, nobody has a clue as to
what's causing them, and once again, the Great Satan is the hardware.
Even though the only common point between the two machines is that they
are running FreeBSD, somehow four separate disk drives and two
controllers must be magically failing simultaneously.

> I know that I routinely scour the OS source code for problems when one
> of our cobbled workstations acts up under Windows.  Oh, wait, I was 
> just reminded that usually it's a network card or video card that goes
> bad in old systems that prevents them from working.  Or a bad memory 
> stick, or a processor that overheated.  But maybe those errors were 
> caused by a bad Windows driver?  I don't know.  Just know that when I 
> replace the part, the OS must suddenly have the bugs fixed in the 
> source code too.

Tell me the exact part that is failing when I get these SCSI errors.

> Really?  Here I thought that was called "troubleshooting", especially
> if the hardware in question is on the support hardware list and people
> on lists with the same hardware aren't having that problem.

People with the same hardware ARE having that problem.  But nobody ever
resolved it for them, either.  It was the usual story about how it must
be a hardware problem.

> Silently resetting without telling you?

I'm talking about the quirks that the drivers take into account for
different types of hardware.  FreeBSD, like many other operating
systems, _does_ contain special code to work around hardware
idiosyncrasies, contrary to the implication that only Windows does this
and that it is somehow a Bad Thing.

> Then what are the errors you're getting again?

Well, they go on for pages.  I posted them here once (long, long ago).
Stuff like this ...

SSTAT0[0x5]:(DMADONE|SDONE) SSTAT1[0xa]:(PHASECHG|BUSFREE)
SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x80]:(DFON) DFCNTRL[0x0] DFSTATUS[0x2d]:(FIFOEMP|DFTHRESH|HDONE|FIFOQWDEMP)
STACK: 0xe5 0x163 0x193 0x3
SCB count = 60
Kernel NEXTQSCB = 59
Card NEXTQSCB = 59
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 13:45 4:17 11:4 0:8
QOUTFIFO entries:
Sequencer Free SCB List: 15 14 1 8 9 2 10 6 3 12 7 5
Sequencer SCB Info:
0 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x8]
1 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
2 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
3 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
4 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x11]
5 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
6 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
7 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
8 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
9 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
10 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
11 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x4]
12 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
13 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0x2d]
14 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
15 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
Pending list:
45 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0]
4 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0]
17 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0]
8 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x7] SCB_LUN[0x0]
Kernel Free SCB list: 16 47 31 49 5 35 15 39 12 25 38 26 3 48 7 34 43 57 36 11 28 30 9 42 46 29 58 56 23 24 32 14 40 13 2 6 27 1 0 20 10 33 41 44 19 37 18 22 21 55 54 53 52 51 50

<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
(pass0:ahc0:0:0:0): SCB 0x2d - timed out
sg[0] - Addr 0xce1fc40 : Length 4
(pass0:ahc0:0:0:0): Queuing a BDR SCB
(pass0:ahc0:0:0:0): Bus Device Reset Message Sent
ahc0: Timedout SCBs already complete. Interrupts may not be functioning.
(pass0:ahc0:0:0:0): no longer in timeout, status = 24b
ahc0: Bus Device Reset on A:0. 3 SCBs aborted
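
For anyone trying to read a dump like the one above, here is a minimal
sketch in Python that splits the NAME[0x..]:(FLAG|FLAG) fields into a
name, a numeric value and a list of flag names, and breaks the
"Disconnected Queue entries" line into number pairs.  The function names
are made up for illustration.  Reading the second number of each pair as
the SCB tag is an inference from this dump alone (13:45 lines up with
sequencer SCB 13 showing SCB_TAG[0x2d], and 0x2d is 45), not something
taken from the driver documentation.

```python
import re

# Matches fields such as SSTAT1[0xa]:(PHASECHG|BUSFREE) or SSTAT2[0x0].
# The flag list in parentheses is optional.
FIELD = re.compile(r"(\w+)\[(0x[0-9a-fA-F]+)\](?::\(([^)]*)\))?")


def parse_fields(line):
    """Return {register_name: (value, [flag names])} for one dump line."""
    out = {}
    for name, value, flags in FIELD.findall(line):
        out[name] = (int(value, 16), flags.split("|") if flags else [])
    return out


def parse_disconnected(line):
    """Split 'Disconnected Queue entries: 13:45 4:17' into (int, int) pairs."""
    _, _, rest = line.partition("entries:")
    return [tuple(int(n) for n in pair.split(":")) for pair in rest.split()]


# Lines copied from the dump above.
regs = parse_fields("SSTAT0[0x5]:(DMADONE|SDONE) SSTAT1[0xa]:(PHASECHG|BUSFREE)")
pairs = parse_disconnected("Disconnected Queue entries: 13:45 4:17 11:4 0:8")

# "SCB 0x2d - timed out": check whether that number appears among the
# second elements of the pairs (assumed here to be tags).
timed_out = int("0x2d", 16)
timed_out_is_disconnected = timed_out in [second for _, second in pairs]
```

Under that assumption, the SCB that timed out (0x2d) is one of the four
listed as disconnected, and the same four numbers (45, 4, 17, 8) are the
ones on the pending list.  That narrows down which commands were
outstanding at the time of the timeout, but it does not by itself say
whether the drive, the cable, the controller or the driver is at fault.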

> Or you find someone with the same hardware setup and install the OS to
> see if it's reproducible.

Others have already encountered this error, on the same hardware, as I
recall from my newsgroup and Web searches on the subject.

In any case, finding an identical machine today would be problematic.

-- 
Anthony




More information about the freebsd-questions mailing list