siisch0: Timeout on slot 30

Jeremy Chadwick freebsd at jdc.parodius.com
Wed Apr 13 08:59:40 UTC 2011


On Wed, Apr 13, 2011 at 08:33:48AM +0100, Mailing Lists wrote:
> Morning All, 
> 
> 
> I see there have been a few threads relating to something similar, specifically around port multiplier timeouts using the SIIS module, with a patch even provided at one point. However, this patch doesn't appear to be valid any more: from checking the cvs logs, 2 of the 3 chunks are already in the 8.2 release, and when I try to apply it, avoiding the rollback, 1 chunk does apply. I think the patch does make some difference, in that the disks don't stall for as long or as frequently, but the errors still appear in the logs.
> 
> Currently running FreeBSD 8.2-RELEASE, with the zfs v28 patch. The main chassis has a Silicon Image 3124 in it, with an eSATA port; connected to the eSATA port is another disk shelf with 5 disks in it, all in one raidz pool. When I run a scrub on the external eSATA shelf, I see timeouts in my logs:
> 
> 
> 
> siisch0: Error while READ LOG EXT 
> siisch0: Error while READ LOG EXT 
> siisch0: Timeout on slot 30 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 04440000 
> siisch0: Timeout on slot 26 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 00440000 
> siisch0: Timeout on slot 22 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: ... waiting for slots 00040000 
> siisch0: Timeout on slot 18 
> siisch0: siis_timeout is 03040000 ss 44440000 rs 44440000 es 00000000 sts 801f2040 serr 00280000 
> siisch0: Error while READ LOG EXT 
> siisch0: Error while READ LOG EXT 
> 
> 
> siisch0 is the port multiplier. The slot number seems crazy to me (I don't have 30 slots; there are only 5 in the external shelf), although I'm not sure if this references something else. I don't believe this is a hardware problem: I've tried with replacement devices/controller cards and still see the above.
> 
> Anyone else seeing this or have any thoughts? 

I cannot help you with your problem, but I can help reduce your
confusion regarding "slot numbers seeming crazy".

Based on an analysis of the src/sys/dev/siis code, there is no 1:1
relation between a slot number and a disk/port/drive-bay/device.  "Slot"
in this context means something completely different; do not correlate
the two things[1].  I don't know what "slot" means in this context; mav@
will know for certain.

Take a look at src/sys/dev/siis/siis.h, specifically the "struct
siis_channel" structure.  You'll see there are multiple "slots"
(declared as struct siis_slot) per channel.  Each slot appears to have
its own identification number -- in the kernel printf(), it's referred
to as slot->slot.  Thus: channel->slot[X]->slot

Your controller has multiple channels -- and each channel can have up to
256 (0-255) slots.  Each slot has a separate DMA channel associated with
it, as well as a separate CCB, and a separate timeout value.

There can be up to 256 (0-255) "slots" assigned to an individual SATA
controller channel.  A channel on SATA usually correlates (1:1) to a
disk, but I don't know how port multipliers fit into the picture (I do
on a physical level, just not on a software level).  Controller channels
can also have their own DMA channel (see [1] again).

This is pretty cool from a performance perspective; I'd never looked at
siis until now.

Anyway, point being: slot != SATA port.  Hopefully that relieves your
concern there.
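
As an illustration only (this is not taken from the siis source): the
slot numbers reported as timed out in the quoted log (30, 26, 22, 18)
match the bits set in the "ss"/"rs" value 44440000.  A minimal Python
sketch, assuming each set bit in those masks marks one outstanding
slot; the helper name slots_in_mask is made up for this example, and
the 32-bit width is an observation about the log, not a statement
about the driver's limits:

```python
def slots_in_mask(mask_hex):
    """Return the bit positions set in a 32-bit hex mask, highest first."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(31, -1, -1) if mask & (1 << bit)]


# Values copied from the "siis_timeout" and "waiting for slots" log lines.
for label, mask in [("ss", "44440000"),
                    ("waiting", "04440000"),
                    ("waiting", "00440000"),
                    ("waiting", "00040000")]:
    print(label, mask, "->", slots_in_mask(mask))
```

Under that assumption, each "Timeout on slot N" line accounts for one
bit and the "waiting for slots" mask shrinks by that bit, which is
consistent with a slot being a per-channel index rather than a drive
bay.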


[1]: Anyone working with technology needs to accept the fact that there
are too many words in the English language that are synonyms for
"thing".  Common tech terms for such: index, slot, port, channel, bay,
tag, bus, volume, and even LUN (yes, as in SCSI LUN).  I'm
forgetting some commonly-used others.

When these terms are used with someone who lacks context of what the
term actually refers to (on a technical/software/hardware level),
confusion is guaranteed.  Am I recommending the printf()s be changed?
Absolutely not.  Just know that slot != SATA port.

Regarding LUN: I still see people correlating (1:1) a LUN with a disk.
When you introduce these people to SANs, where a LUN often consists of
multiple devices (commonly disks), confusion is guaranteed.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-fs mailing list