svn commit: r308217 - in head/sys/dev: mpr mps

Alan Somers asomers at freebsd.org
Tue Nov 14 18:26:33 UTC 2017


On Thu, Nov 10, 2016 at 12:54 AM, Harry Schmalzbauer <freebsd at omnilan.de> wrote:
> Regarding Scott Long's message of 09.11.2016 17:06 (localtime):
>>
>>> On Nov 4, 2016, at 3:18 AM, Harry Schmalzbauer <freebsd at omnilan.de> wrote:
>>>> If it's really mps(4) that decides to store driveserial-targetID
>>> numbering in the "persistent non-manufacturing config pages" of the
>>> controller, mpsutil(8) should be able to reset it. Otherwise replacing
>>> failed drives, or - even more confusing - rearranging drive/zpool layouts
>>> is very unsatisfying.
>>>
>>> Maybe "-1" should be mentioned in the sysctl description, otherwise this is
>>> another very hard to find/influence behaviour.
>>>
>>>
>>
>> Thanks for the feedback.  For the record, this problem happens on a
>> Supermicro X10SDV-7TP4F motherboard.  It appears that the support
>> logic around the LSI controller is mis-configured to show the SAS ports
>> being part of an enclosure with 0 slots, instead of 8.  This confuses
>> the device mapper logic in the driver that activates if the controller NVRAM
>> doesn’t specify a pre-existing mapping.  Typically this is not the default,
>> the NVRAM persistent mappings are the default and are used by the driver,
>> so I considered this problem to be unique to our deployment.  Maybe it’s
>> more of a problem than I estimated?  Anyways, sounds like this new
>
> I haven't had too much diversity regarding Fusion-MPT and this mapping
> problem has never hit me yet, so I can't help estimating the severity of
> that specific problem.
>
> But I think I haven't described clearly that I'm having da(4)-numbering
> problems which are not directly enclosure-mapping related (at least not
> related to the nvram mapping page), but which hopefully could be worked
> around the same way:
>
> I frequently had problems replacing drives due to the eternal targetID
> assigning (every drive with a new/unknown serial gets a consecutive
> targetID regardless of the enclosure-slot or the number of attached drives).
> I guess this is stored in a completely different NVRAM page than the
> enclosure-mapping page.
> Your patch is intended to solve problems with invalid/absent
> enclosure-mapping page, but I guess I'll sooner or later need to try the
> "hw.mpr.use_phy_num=-1" sysctl
> to hopefully override the targetID++ assignment, which causes "holes"
> every time a drive gets replaced.
> In that case it's just a cosmetic problem, but when rearranging old and
> new drives on the same controller, it can cause severe confusion for the
> admins – leading to fatal mistakes. And migrating disks to new
> controller/chassis is even more problematic if the host had hard wired
> da(4) assignments via device.hints.
>
> I couldn't search the driver to find out if the "save eternal targetID
> in nvram page N" is really present and not firmware-only induced, but
> since I saw a different behaviour on windows, I guess it is.
> I could circumvent the problem by simply using IR firmware since it is
> only active when mps(4) runs IT firmware.
>
> But having a way to disable "save eternal targetID in nvram page N" for
> mps(4)-IT (via sysctl, and possibly for mpr/mpt also) would be very welcome.
>
> Top on my wishlist was extending mpsutil(8) to be able to list and
> selectively delete single serial-targetID mappings, but I haven't even
> found a way to do that with any vendor provided tool, not even with
> LSIUtil – where I can at least erase _all_ mappings.
>
>
>> functionality should be properly documented in the driver.
>
> Thanks for your continuous support/help/improvements!
>
> -Harry

We've run into a situation where we had to use this fallback logic
too.  In our case, it happened when upgrading mpr HBA firmware from
14.0.0.0 to 15.0.0.0, which corrupted the internal mapping tables.  I
fixed the resulting panic in r325363, but that exposed a problem with
the fallback logic.  Since the fallback logic uses the phy number as
the target ID, it doesn't work on SAS busses with more than one
expander.  In that case, multiple drives can get the same target ID,
and only the first is usable.  In our codebase, I fixed this by
setting the id to ((config_page.EnclosureHandle - 1) << 7) |
config_page.Slot.  That formula will work for enclosures of up to
128 slots and 8 enclosures.  However, it will obviously fail if the
enclosure assigns the same Slot to multiple drives.  That sounds like
the case for Scott.  Are there any alternatives I'm missing that would
satisfy everyone?
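
To make the two schemes concrete, here is a minimal standalone sketch
(not the actual mpr/mps code).  The struct and function names are
hypothetical; only the EnclosureHandle and Slot field names and the
formula itself come from the description above, and the sketch assumes
that enclosure handles start at 1.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the fields of the SAS device config page
 * that matter here.  The real firmware structure has many more fields.
 */
struct dev_page_sketch {
	uint16_t EnclosureHandle;	/* assumed to start at 1 */
	uint16_t Slot;			/* slot number within the enclosure */
	uint8_t  PhyNum;		/* phy number on the parent device */
};

/* Fallback scheme: the phy number is used directly as the target ID. */
static unsigned int
id_from_phy(const struct dev_page_sketch *p)
{
	return (p->PhyNum);
}

/* Alternative scheme: enclosure in the high bits, slot in the low 7 bits. */
static unsigned int
id_from_encl_slot(const struct dev_page_sketch *p)
{
	return (((unsigned int)(p->EnclosureHandle - 1) << 7) | p->Slot);
}
```

With two drives that both sit on phy 3 and in slot 3, but behind
different expanders (EnclosureHandle 1 and 2), id_from_phy() returns 3
for both, while id_from_encl_slot() returns 3 and 131.  If an enclosure
reports the same Slot for two of its drives, id_from_encl_slot()
collides as well.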

-Alan

