LSI2008 controller clobbers first disk with new LSI mps driver

Jason Wolfe nitroboost at gmail.com
Mon Mar 19 18:52:19 UTC 2012


Kashyap/LSI,

Any movement here?  I'm also trying to test some bug fixes to the em
driver in 8.3-PRERELEASE, but am still locked into the old mps driver
from last year because of this issue.  If this needs to be taken
offline, I can start an internal thread.

Thanks again,

Jason

On Thu, Mar 8, 2012 at 11:26 AM, Douglas Gilbert <dgilbert at interlog.com> wrote:
> Kashyap,
> Backing up ... I thought this thread was about the mps
> driver failing to do the SAS discover process properly.
> Hence a disk did not appear because it was "hidden" by a
> SES device which had the same (device?) slot number.
>
> The SAS discover process is described in section 4.2 of
> spl2r04b.pdf (at t10.org). That is the latest draft. Please
> note that section never mentions the word "slot". I would
> hazard a guess that no SAS standard or draft has ever
> mentioned slots in the context of the discover process.
>
> The concept of device "slot" *** numbers comes from the
> SCSI Enclosure Services (SES) standards of which ses3r04.pdf
> is the latest draft. SAS provides the slot number _optionally_
> in the long form SMP DISCOVER response and does _not_ provide
> the device slot number in the SMP DISCOVER LIST short form
> response.
>
> So IMO the device slot number is just a bit of helpful
> information that SAS might provide and that slot
> number should not interfere with the SAS discover
> process.
>
>
> *** the term "slot" is used in the SAS port layer state
>    machine in a different context. It is also possible
>    that "slot" is a term used in LSI firmware.
>
>
> A few data points: I have an Intel RES2SV240 which contains
> an LSI SAS2X24 expander and an HP Expander card which contains
> a PMC Sierra PM8005 SAS-2 expander. Both report a device slot
> number of 255 (i.e. not provided) via their SMP DISCOVER
> responses. When the inbuilt SES device on each expander is
> probed, the LSI part reports device slot numbers 0 through 23
> while the PMC part reports a device slot number of 0 for all
> array devices. In both cases the SES device itself is not
> listed amongst the SES "array device slot" elements.
>
> Doug Gilbert
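
To make the point about slot numbers concrete, here is a toy Python sketch of
why keying a device table on the slot number can hide a disk, while keying on
the SAS address cannot. It is only an illustration under stated assumptions:
it does not reflect how the mps driver or the LSI firmware is actually
written, the function names are made up, and the SAS addresses are borrowed
from the smp_discover listing quoted further down purely as sample values.

```python
SLOT_NOT_PROVIDED = 255  # value Doug reports when an expander supplies no slot


def index_by_slot(devices):
    """Toy table keyed by slot number: a later device replaces an earlier one."""
    table = {}
    for dev in devices:
        table[dev["slot"]] = dev
    return table


def index_by_sas_address(devices):
    """Toy table keyed by SAS address; the slot is kept only as optional info."""
    table = {}
    for dev in devices:
        slot = dev["slot"]
        table[dev["addr"]] = dict(
            dev, slot=None if slot == SLOT_NOT_PROVIDED else slot
        )
    return table


# Hypothetical sample: two disks, plus an SES device that also claims slot 0.
DEVICES = [
    {"addr": "5000c5003bc2c389", "kind": "disk", "slot": 0},
    {"addr": "5000c5003bc308e5", "kind": "disk", "slot": 1},
    {"addr": "500c04f2b64cdd3d", "kind": "ses", "slot": 0},
]

if __name__ == "__main__":
    print(len(index_by_slot(DEVICES)), "devices visible when keyed by slot")
    print(len(index_by_sas_address(DEVICES)), "devices visible when keyed by address")
```

Keyed by slot, the SES device replaces the first disk; keyed by SAS address,
all three devices stay visible and the slot is just extra information.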
>
>
>
> On 12-03-07 12:44 PM, Desai, Kashyap wrote:
>>
>> Jason,
>>
>> We discussed this issue with our architect, and he strongly recommends
>> not providing any workaround where the enclosure configuration is not
>> correct. A similar issue was reported by another customer some time back,
>> and they also reconfigured their enclosure to resolve it.
>> "The enclosure configuration needs to be fixed so it advertises enough
>> slots (phys disks + num of SES devices) and it places the SES devices
>> (assigned slot numbers) above the physical disks."
>>
>>
>> ~ Kashyap
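
The rule quoted from LSI above can be written down as a small check. This is
a sketch only: the function name and its arguments are hypothetical, and the
rule is LSI's recommendation as stated in this thread, not a requirement
taken from a SAS or SES standard. The numbers in the example are the ones
from the log quoted further down (NumSlots 13, StartSlot 0) together with the
slot assignment Kashyap suspects (disks in slots 0 to 11, SES device in
slot 0).

```python
def check_enclosure_slots(num_slots, start_slot, disk_slots, ses_slots):
    """Return a list of problems with an enclosure's slot assignment.

    Rules checked (as quoted from LSI in this thread):
      * enough slots are advertised for disks plus SES devices
      * every slot lies inside the advertised range
      * SES devices sit above the physical disks and share no slot with them
    """
    disk_slots = list(disk_slots)
    ses_slots = list(ses_slots)
    problems = []

    needed = len(disk_slots) + len(ses_slots)
    if num_slots < needed:
        problems.append(f"advertises {num_slots} slots but needs {needed}")

    valid = range(start_slot, start_slot + num_slots)
    for slot in disk_slots + ses_slots:
        if slot not in valid:
            problems.append(f"slot {slot} is outside the advertised range")

    shared = sorted(set(disk_slots) & set(ses_slots))
    if shared:
        problems.append(f"SES device shares slot(s) {shared} with a disk")
    elif disk_slots and ses_slots and min(ses_slots) < max(disk_slots):
        problems.append("SES device slot is below a physical disk slot")

    return problems


if __name__ == "__main__":
    # Suspected assignment from the thread: SES device on slot 0.
    print(check_enclosure_slots(13, 0, range(12), [0]))
    # Assignment LSI asks for: SES device on slot 12.
    print(check_enclosure_slots(13, 0, range(12), [12]))
```

The first call reports the shared slot 0; the second returns an empty list.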
>>
>>
>>> -----Original Message-----
>>> From: Desai, Kashyap
>>> Sent: Wednesday, February 29, 2012 10:08 PM
>>> To: 'dgilbert at interlog.com'; Jason Wolfe
>>> Cc: freebsd-scsi at freebsd.org; McConnell, Stephen
>>> Subject: RE: LSI2008 controller clobbers first disk with new LSI mps
>>> driver
>>>
>>> Hi Jason,
>>>
>>> I have started a discussion with LSI internal folks to get better clarity
>>> on this issue. Since our key person is on vacation, we may get clarity
>>> on this next week.
>>> I cannot provide a temporary workaround upstream (because this is
>>> against our design), but if you want to use one in your environment, I can
>>> provide you with a temporary patch.
>>>
>>> Doug,
>>>
>>> Thanks for providing your view; I have conveyed it to our architect.
>>>
>>> ~ Kashyap
>>>
>>>> -----Original Message-----
>>>> From: Douglas Gilbert [mailto:dgilbert at interlog.com]
>>>> Sent: Tuesday, February 28, 2012 5:11 AM
>>>> To: Jason Wolfe
>>>> Cc: Desai, Kashyap; freebsd-scsi at freebsd.org; McConnell, Stephen
>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>> driver
>>>>
>>>> On 12-02-27 02:59 PM, Jason Wolfe wrote:
>>>>>
>>>>> On Wed, Feb 22, 2012 at 9:11 AM, Desai, Kashyap <Kashyap.Desai at lsi.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Douglas Gilbert [mailto:dgilbert at interlog.com]
>>>>>>> Sent: Wednesday, February 22, 2012 8:52 PM
>>>>>>> To: Desai, Kashyap
>>>>>>> Cc: Jason Wolfe; freebsd-scsi at freebsd.org; McConnell, Stephen
>>>>>>> Subject: Re: LSI2008 controller clobbers first disk with new LSI mps
>>>>>>> driver
>>>>>>>
>>>>>>> On 12-02-22 03:39 AM, Desai, Kashyap wrote:
>>>>>>>>
>>>>>>>> Here is a possible root cause of this issue.
>>>>>>>>
>>>>>>>> The enclosure you are using in your setup might not be configured
>>>>>>>> properly.
>>>>>>>>
>>>>>>>> You have an enclosure with 12 slots + 1 SES device.
>>>>>>>> See the detail from the log below.
>>>>>>>>
>>>>>>>>      EventDataLength: 5
>>>>>>>>      AckRequired: 0
>>>>>>>>      Event: SasEnclDeviceStatusChange (0x1d)
>>>>>>>>      EventContext: 0x0
>>>>>>>>      EnclosureHandle: 0x2
>>>>>>>>      ReasonCode: Added
>>>>>>>>      PhysicalPort: 0
>>>>>>>>      NumSlots: 13
>>>>>>>>      StartSlot: 0
>>>>>>>>      PhyBits: 0xff
>>>>>>>>
>>>>>>>> StartSlot is 0 in this case.
>>>>>>>> The correct behavior is for each device in your enclosure to have a
>>>>>>>> different slot number, from 0 through 12.
>>>>>>>>
>>>>>>>> I suspect that the SES device is not configured correctly and is using
>>>>>>>> slot-0 by default. This can create an issue for the actual device that
>>>>>>>> is connected to slot-0.
>>>>>>>>
>>>>>>>> So in your setup, slot-0 through slot-11 are assigned to the actual
>>>>>>>> phys of your enclosure, and slot-0 is assigned again to the SES device
>>>>>>>> instead of slot-12.
>>>>>>>
>>>>>>> No. SAS-2 expanders typically have an integral SES device on an
>>>>>>> expander _virtual_ phy (see SMP DISCOVER (LIST) response). Once
>>>>>>> you see that virtual phy flag the slot number is irrelevant.
>>>>>>
>>>>>>
>>>>>> Doug,
>>>>>>
>>>>>> I need some more info so that I can understand your point better.
>>>>>>
>>>>>> I have one enclosure set up on FreeBSD. Here is the smp_discover output
>>>>>> (smp_discover_list is failing for me).
>>>>>>
>>>>>>
>>>>>>   phy   0: inaccessible (phy vacant)
>>>>>>   phy   1: inaccessible (phy vacant)
>>>>>>   phy   2: inaccessible (phy vacant)
>>>>>>   phy   3: inaccessible (phy vacant)
>>>>>>   phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   5:S:attached:[500605b012345888:02  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   6:S:attached:[500605b012345888:01  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy   7:S:attached:[500605b012345888:00  i(SSP+STP+SMP)]  6 Gbps
>>>>>>   phy  12:D:attached:[5000c5003bc2c389:00  t(SSP)]  6 Gbps
>>>>>>   phy  13:D:attached:[500000e116ee91e2:00  t(SSP)]  6 Gbps
>>>>>>   phy  14:D:attached:[5000c5003bc308e5:00  t(SSP)]  6 Gbps
>>>>>>   phy  15:D:attached:[5000c5003bc2f0d1:00  t(SSP)]  6 Gbps
>>>>>>   phy  16:D:attached:[5000c5003bc2ff3d:00  t(SSP)]  6 Gbps
>>>>>>   phy  17:D:attached:[5000c5003bae5fdd:00  t(SSP)]  6 Gbps
>>>>>>   phy  18:D:attached:[5000c5003bae5eb1:00  t(SSP)]  6 Gbps
>>>>>>   phy  19:D:attached:[5000c5003bc2d135:00  t(SSP)]  6 Gbps
>>>>>>   phy  20:D:attached:[5000c5003baea36d:00  t(SSP)]  6 Gbps
>>>>>>   phy  21:D:attached:[5000c5003bc2a8c9:00  t(SSP)]  6 Gbps
>>>>>>   phy  22:D:attached:[5000c5003bc237a9:00  t(SSP)]  6 Gbps
>>>>>>   phy  23:D:attached:[5000c5003bc2cec1:00  t(SSP)]  6 Gbps
>>>>>>   phy  24:D:attached:[500000e01d92cb52:00  t(SSP)]  3 Gbps
>>>>>>   phy  25:D:attached:[500000e01d74cfb2:00  t(SSP)]  3 Gbps
>>>>>>   phy  26:D:attached:[500000e01d656052:00  t(SSP)]  3 Gbps
>>>>>>   phy  27:D:attached:[500000e01d7cad52:00  t(SSP)]  3 Gbps
>>>>>>   phy  28:D:attached:[500c04f2b64cdd1c:00  t(SATA)]  3 Gbps
>>>>>>   phy  29:D:attached:[500c04f2b64cdd1d:00  t(SATA)]  3 Gbps
>>>>>>   phy  30:D:attached:[500000e01d73c262:00  t(SSP)]  3 Gbps
>>>>>>   phy  31:D:attached:[500000e01d536b22:00  t(SSP)]  3 Gbps
>>>>>>   phy  32:D:attached:[500000e01d92cab2:00  t(SSP)]  3 Gbps
>>>>>>   phy  33:D:attached:[500000e01afd8792:00  t(SSP)]  3 Gbps
>>>>>>   phy  34:D:attached:[5000c5003bc30301:00  t(SSP)]  6 Gbps
>>>>>>   phy  35:D:attached:[5000c5003bb09a69:00  t(SSP)]  6 Gbps
>>>>>>   phy  36:D:attached:[500c04f2b64cdd3d:00  V i(SSP) t(SSP)]  6 Gbps <--- This has virtual phy set.
>>>>>>
>>>>>>
>>>>>> What I understood from your explanation is that if the virt_phy field
>>>>>> is set, we should not trust the slot for that entry.
>>>>>>
>>>>>> You are suggesting using the phy index instead of the slot. Just for my
>>>>>> information: how can I see how slot details map to phys?
>>>>
>>>> Kashyap,
>>>> I haven't written a SAS discover algorithm but there
>>>> must be plenty of examples out there. One way to do it
>>>> is to find all the phy_ids attached to targets, in this
>>>> case there are SAS (SSP) and SATA targets. Each SATA
>>>> target phy_id will correspond to one SATA disk (or could be an
>>>> ATAPI device (e.g. DVD/BD player)). The SSP targets are a
>>>> bit trickier because two (or more) phys could be connected
>>>> to the same target (either a wide port or multiple (target)
>>>> ports). With a wide port each component phy has the same
>>>> attached SAS address (so above you have a wide initiator
>>>> port (phy ids 4,5,6,7) but no wide target ports). If a
>>>> SAS disk has multiple target ports connected, FreeBSD
>>>> probably has a device node for each. So for each SCSI (SSP)
>>>> target port you need a REPORT LUNS command issued on LUN 0
>>>> (or the REPORT LUNS well known logical unit) to find the
>>>> LUs it contains. A device node is created for each LU.
>>>>
>>>> Anyway I'm sure many folks in LSI know the SAS discover
>>>> process better than I do. Ask them :-) Surely most of
>>>> the above is already done in your HBA's firmware.
>>>>
>>>>
>>>> BTW I don't think slot numbers are reliable, and they don't apply
>>>> to things on virtual phys, so they will just cause you
>>>> problems when used in the discover process, as this thread
>>>> attests. The BIOS on LSI's HBAs does a discover process
>>>> but is only interested in bootable devices so SES devices
>>>> don't appear.
>>>>
>>>>
>>>> Doug Gilbert
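
The first step Doug describes (collect the phys attached to targets and group
phys that share an attached SAS address into wide ports) can be sketched in
Python against text in the style of the smp_discover listing quoted above.
This is a rough sketch, not a SAS discover implementation: the regular
expression is an assumption fitted to the lines shown in this thread, the
two hex digits after the colon are simply carried along rather than
interpreted, the function names are hypothetical, and the REPORT LUNS step is
not covered at all.

```python
import re
from collections import defaultdict

# Fitted to lines such as:
#   phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps
PHY_LINE = re.compile(
    r"phy\s+(?P<phy>\d+):(?P<routing>[A-Z]):attached:"
    r"\[(?P<addr>[0-9a-f]{16}):(?P<suffix>[0-9a-f]{2})\s+(?P<flags>[^\]]*)\]"
)


def parse_phys(text):
    """Return one dict per attached phy; vacant phys are skipped."""
    phys = []
    for line in text.splitlines():
        m = PHY_LINE.search(line)
        if m is None:
            continue
        flags = m.group("flags")
        phys.append({
            "phy": int(m.group("phy")),
            "addr": m.group("addr"),
            "virtual": "V" in flags.split(),
            "target": "t(" in flags,
            "sata": "t(SATA)" in flags,
        })
    return phys


def group_ports(phys):
    """Group phy ids by attached SAS address; several phys mean a wide port."""
    ports = defaultdict(list)
    for p in phys:
        ports[p["addr"]].append(p["phy"])
    return dict(ports)


# A few lines copied from the listing in this thread.
SAMPLE = """\
phy   0: inaccessible (phy vacant)
  phy   4:S:attached:[500605b012345888:03  i(SSP+STP+SMP)]  6 Gbps
  phy   5:S:attached:[500605b012345888:02  i(SSP+STP+SMP)]  6 Gbps
  phy   6:S:attached:[500605b012345888:01  i(SSP+STP+SMP)]  6 Gbps
  phy   7:S:attached:[500605b012345888:00  i(SSP+STP+SMP)]  6 Gbps
  phy  12:D:attached:[5000c5003bc2c389:00  t(SSP)]  6 Gbps
  phy  28:D:attached:[500c04f2b64cdd1c:00  t(SATA)]  3 Gbps
  phy  36:D:attached:[500c04f2b64cdd3d:00  V i(SSP) t(SSP)]  6 Gbps
"""

if __name__ == "__main__":
    for addr, ids in group_ports(parse_phys(SAMPLE)).items():
        print(addr, "wide" if len(ids) > 1 else "narrow", ids)
```

On the sample, phys 4 to 7 fall into one wide initiator port, each target phy
is its own narrow port, and phy 36 is the only one flagged virtual. Nothing
here needs a slot number.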
>>>>
>>>>> Kashyap,
>>>>>
>>>>> Let me know if there are any changes agreed upon, I'm happy to test
>>>>> out patches as this is affecting a large number of our machines.  I
>>>>> can only imagine the same for others as they start to upgrade, as this
>>>>> is standard SuperMicro hardware.
>>>>>
>>>>> Thanks,
>>>>> Jason
>>>>>
>>
>>
>

